A variational approach to finite connectivity spin-glass-like models is developed and applied to describe the structure of optimal solutions in random satisfiability problems. Our variational scheme accurately reproduces the known replica symmetric results and also allows for the inclusion of replica symmetry breaking effects. For the 3-SAT problem, we find two transitions as the ratio $\alpha$ of logical clauses per Boolean variable increases. At the first one, $\alpha_s \simeq 3.96$, a non-trivial organization of the solution space in geometrically separated clusters emerges. The multiplicity of these clusters as well as the typical distances between different solutions are calculated. At the second threshold, $\alpha_c \simeq 4.48$, satisfying assignments disappear and a finite fraction $B_0 \simeq 0.13$ of variables are overconstrained and take the same values in all optimal (though unsatisfying) assignments. These values have to be compared to $\alpha_c \simeq 4.27$, $B_0 \simeq 0.4$ obtained from numerical experiments on small instances. Within the present variational approach, the SAT-UNSAT transition naturally appears as a mixture of a first and a second order transition. For the mixed $2+p$-SAT with $p < 2/5$, the behavior is as expected much simpler: a unique smooth transition from SAT to UNSAT takes place at $\alpha_c = 1/(1-p)$.
I. INTRODUCTION

Over the last few years, the computer science community has become increasingly aware of the occurrence of phase transitions in hard combinatorial problems. When some control parameters are tuned, many problems of practical importance indeed exhibit drastic changes in their behavior. The interest in such threshold phenomena has been enhanced by the observation that instances located at phase boundaries are the most difficult ones to solve. Even NP-complete problems (whose solving times are thought to grow exponentially with their sizes) do not behave so badly far from the threshold. As a consequence, the results of worst-case complexity theory do not seem to be very relevant in practice, and the need for a typical-case complexity theory has clearly emerged. Recently, the use of techniques and concepts from the statistical physics of disordered systems, combined with numerical investigations, has suggested that the nature of the transition taking place could be related to the upsurge of complexity at the threshold. This conjecture can be best exemplified on the paradigm of combinatorial problems showing phase transition behavior, namely the random $K$-Satisfiability ($K$-SAT) problem.

$K$-SAT is defined as follows. Consider $N$ Boolean variables $\{x_i = 0, 1\}_{i=1,\ldots,N}$. Choose randomly $K$ among the $N$ possible indices $i$ and then, for each of them, choose a literal, that is the corresponding $x_i$ or its negation $\bar{x}_i$, with equal probabilities one half. A clause $C$ is the logical OR of the $K$ previously chosen literals, that is $C$ will be true (or satisfied) if and only if at least one literal is true. Next, repeat this process to obtain $M$ independently chosen clauses $\{C_\ell\}_{\ell=1,\ldots,M}$ and ask for all of them to be true at the same time (i.e. we take the logical AND of the $M$ clauses). For large instances ($M, N \to \infty$), numerical simulations and mathematical analysis indicate that the probability of finding a logical assignment of the $\{x_i\}$'s satisfying all the clauses falls abruptly from one down to zero when $\alpha = M/N$ crosses a critical value $\alpha_c(K)$. Above $\alpha_c(K)$, not all clauses can be satisfied simultaneously. This scenario is rigorously established for 2-SAT, which is a polynomial problem and whose threshold $\alpha_c(2)$ equals 1. When $K \ge 3$, $K$-SAT is NP-complete. Some upper and lower bounds on $\alpha_c(K)$ have been derived, and numerical simulations have recently yielded estimates of $\alpha_c$, e.g. $\alpha_c(3) \simeq$ 4.25-4.30.
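The construction of a random instance described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `random_ksat` and `violated`, the signed-integer encoding of literals and the values of `n`, `alpha` and `k` are assumptions made here, not part of the original work.

```python
import random

def random_ksat(n_vars, n_clauses, k, rng):
    """Draw n_clauses independent random clauses of length k.

    A clause is a tuple of signed integers: +i stands for x_i,
    -i for its negation (variables are numbered 1..n_vars).
    """
    clauses = []
    for _ in range(n_clauses):
        indices = rng.sample(range(1, n_vars + 1), k)   # k distinct indices
        # Each variable is negated with probability one half.
        clause = tuple(i if rng.random() < 0.5 else -i for i in indices)
        clauses.append(clause)
    return clauses

def violated(clauses, assignment):
    """Number of clauses violated by `assignment` (dict: index -> bool)."""
    count = 0
    for clause in clauses:
        # A clause is satisfied as soon as one of its literals is true.
        if not any(assignment[abs(l)] == (l > 0) for l in clause):
            count += 1
    return count

rng = random.Random(0)
n, alpha, k = 20, 4.0, 3                      # illustrative values only
clauses = random_ksat(n, int(alpha * n), k, rng)
assignment = {i: rng.random() < 0.5 for i in range(1, n + 1)}
print(violated(clauses, assignment))
```

An instance is satisfiable exactly when some assignment makes `violated` return zero; the ratio `alpha` plays the role of $M/N$.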
When combining $pM$ clauses of length 3 with $(1-p)M$ clauses of length 2, one obtains the so-called $2+p$-SAT model, which smoothly interpolates between the instances of the easy-polynomial (2-SAT when $p = 0$) and the hard-exponential (3-SAT when $p = 1$) classes. Statistical mechanics and replica theory show that there is a tricritical value $p_0 \simeq 0.4$ separating second-order SAT-UNSAT phase transitions for $p < p_0$ from random first-order SAT-UNSAT phase transitions for $p > p_0$. The change of the nature of the transition results from a change of the structure of the optimal Boolean assignments (satisfying all clauses when $\alpha < \alpha_c(2+p)$ or minimizing the number of violated clauses for $\alpha > \alpha_c(2+p)$) when crossing the threshold. As shown previously, the SAT-UNSAT transition results from the freezing of a finite fraction of Boolean variables which acquire a constant value in all optimal assignments. The emergence of such over-constrained variables at $\alpha_c(2+p)$ appears to be continuous when $p < p_0$ and becomes strongly discontinuous above $p_0$. The existence of an $O(N)$ backbone of over-constrained variables at the threshold above the tricritical point has deep consequences. Indeed, a common search algorithm such as the Davis-Putnam procedure will fail with finite probability to correctly assign the first variable and will waste much time exploring empty branches of the search tree before backtracking and correcting the early mistake. Numerical experiments strongly support this picture: at the threshold $\alpha_c(2+p)$, the running time to solve an instance of the $2+p$-SAT problem grows polynomially with $N$ for $p < 0.4$ and exponentially for $p > 0.6$.
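To make the role of backtracking concrete, the following is a minimal sketch of a Davis-Putnam-type search (a DPLL-style variant with unit propagation and a naive branching rule). It is not the procedure used in the numerical experiments quoted above; the names `simplify` and `dpll`, the signed-integer clause encoding and the branch counter `stats` are assumptions introduced for illustration.

```python
def simplify(clauses, literal):
    """Set `literal` to true: drop satisfied clauses, shorten the others.

    Returns None if an empty (violated) clause is produced.
    """
    result = []
    for clause in clauses:
        if literal in clause:
            continue                      # clause is satisfied
        reduced = tuple(l for l in clause if l != -literal)
        if not reduced:
            return None                   # contradiction
        result.append(reduced)
    return result

def dpll(clauses, stats):
    """Return a list of true literals satisfying `clauses`, or None."""
    assigned = []
    # Unit propagation: a clause of length one forces its literal.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
        assigned.append(unit)
    if not clauses:
        return assigned
    stats["branches"] += 1                # one more node of the search tree
    literal = clauses[0][0]               # naive branching rule
    for choice in (literal, -literal):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            rest = dpll(reduced, stats)
            if rest is not None:
                return assigned + [choice] + rest
    return None                           # both branches failed: backtrack

stats = {"branches": 0}
formula = [(1, 2), (-1, 2), (1, -2), (-1, -2, 3)]
print(dpll(formula, stats), stats)
```

A wrong first choice on a backbone variable forces the search to exhaust the whole subtree below it before the `for` loop tries the opposite value, which is the mechanism alluded to in the text.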

A further understanding of the SAT-UNSAT transition undoubtedly requires a deeper knowledge of the organization of the optimal assignments. Information about the mutual (Hamming) distance between solutions, the size of the backbone, etc. is indeed highly relevant to understanding and hopefully improving the efficiency of algorithms. From a statistical physics point of view, the main difficulty stems from the fact that $K$-SAT is naturally mapped onto a disordered spin model with finite connectivity. Although the lack of geometrical correlation in the clauses makes this model mean-field, the finite number of neighbours of each spin results in much stronger local field fluctuations, and the theory is not as simple as its infinite-connectivity counterpart. Previous studies have shown that even at the simplest replica symmetric (RS) level, the order parameter describing finite-connectivity spin-glasses turns out to be a full distribution of effective fields. Its determination requires solving a functional self-consistent equation and is far from easy. The situation becomes even worse and apparently mathematically intractable (except in some very peculiar cases) when replica symmetry breaking (RSB) effects are taken into account.

To circumvent the difficulty of solving the RS or RSB self-consistent equations, we propose in this article a different strategy. Our claim is that a variational approach provides very precise results at a bearable computational cost. Using some elementary information about the gross physical features of the $K$-SAT model, e.g. the existence of a backbone, we show that an RS variational calculation is able to recover all known results and to predict new ones (under certain assumptions) such as the value of the tricritical point $p_0 = 2/5$. In addition, we present some new results obtained from RSB variational calculations in both SAT and UNSAT regimes that unveil the structure of optimal assignments in the $K$-SAT problem. This paper is organised as follows. In Section II, we recall the main steps of the statistical mechanics approach to the $K$-SAT problem. We then explain the variational procedure to be followed depending on the particular phase, SAT or UNSAT, under investigation. Section III is devoted to the analysis of the structure of optimal configurations in the SAT phase. The SAT-UNSAT transition is studied in Section IV. For both Sections III and IV, we first focus on the replica symmetric variational solution and then present the additional features corresponding to replica symmetry breaking effects. Finally, the emerging picture of the space of solutions is summed up, together with some perspectives, in Section V.

II. STATISTICAL MECHANICS AND VARIATIONAL APPROACH 

A. Replica formalism and free-energy functional 

In this section, we give an overview of the statistical mechanics approach to random $K$-Satisfiability problems; see [9-11] for the original works on this subject. We adopt the Ising-spin notation, so a true Boolean variable is mapped onto $S_i = +1$, whereas a false variable gives $S_i = -1$. A logical assignment $\{S\}$ is a set of $N$ spins $S_i$ out of all $2^N$ possible configurations. We denote the (random) set of clauses by $\{C\}$. We choose the energy-cost function $\mathcal{H}[\{C\},\{S\}]$ to be the number of clauses violated by the configuration $\{S\}$. If the ground state energy is zero (respectively strictly positive), the logical clauses are satisfiable (resp. unsatisfiable). The free-energy density $f$ of the resulting spin system at a formal temperature $T$ is given by the logarithm of the partition function

$$ Z[\{C\}] = \sum_{\{S\}} \exp\left( -\mathcal{H}[\{C\},\{S\}]/T \right) , \qquad (1) $$

and is assumed to be self-averaging as the size $N$ of the instance of the $K$-SAT problem goes to infinity. In order to calculate the disorder average, the replica trick is used:

$$ \overline{\ln Z} = \lim_{n \to 0} \partial_n \overline{Z^n} , \qquad (2) $$

where at first a positive integer number $n$ is considered, and the replica limit $n \to 0$ is achieved by some kind of analytical continuation in $n$. Introducing the $2^n$ order parameters $c(\vec{\sigma})$, equal to the fractions of "sites" $i$ such that $S_i^a = \sigma^a$, $\forall a = 1, \ldots, n$, the thermodynamic limit of the free-energy density can be calculated by the saddle-point method
through a maximization over all normalized ($\sum_{\vec{\sigma}} c(\vec{\sigma}) = 1$) and even ($c(-\vec{\sigma}) = c(\vec{\sigma})$) order parameters [13]. Eventually, the ground state properties are obtained as the temperature $T = 1/\beta$ is sent to zero in (3). Following [11], the first (respectively second) term on the r.h.s. of (3) will be hereafter called the effective entropy (resp. effective energy) contribution to the free-energy.
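For very small instances the partition function (1) can be evaluated by direct enumeration, which gives a concrete feeling for the energy function and for the zero-temperature limit. The sketch below assumes a signed-integer encoding of clauses and uses hypothetical helper names (`energy`, `partition_function`, `ground_state_energy`); it is exponential in $N$ and only meant as an illustration of the definitions, with the convention $f = -(T/N) \ln Z$.

```python
import itertools
import math

def energy(clauses, spins):
    """Number of violated clauses; spins[i-1] = +1 (true) or -1 (false)."""
    return sum(
        # A clause is violated when every one of its literals is false.
        all(spins[abs(l) - 1] != (1 if l > 0 else -1) for l in clause)
        for clause in clauses
    )

def partition_function(clauses, n_spins, temperature):
    """Z = sum over all 2^N configurations of exp(-H/T), by enumeration."""
    beta = 1.0 / temperature
    return sum(
        math.exp(-beta * energy(clauses, spins))
        for spins in itertools.product((1, -1), repeat=n_spins)
    )

def ground_state_energy(clauses, n_spins):
    """Minimal number of violated clauses over all configurations."""
    return min(
        energy(clauses, spins)
        for spins in itertools.product((1, -1), repeat=n_spins)
    )

clauses = [(1, 2), (-1, 2), (1, -2), (-1, -2)]   # unsatisfiable 2-SAT toy
for T in (1.0, 0.1):
    Z = partition_function(clauses, 2, T)
    print(T, -T * math.log(Z) / 2)               # free-energy density f
print(ground_state_energy(clauses, 2))
```

In this toy example every configuration violates exactly one clause, so the free-energy density tends to the ground state energy density $1/2$ as $T \to 0$.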



B. Simplest order parameter and replica symmetry 

Finding the saddle-point $c(\vec{\sigma})$ of (3) is in general a very hard task. Since the functional (3) is invariant under permutations of the $n$ replicas, it is possible to restrict the variational problem to the subspace of $c(\vec{\sigma})$ with the same permutation symmetry. In this so-called replica symmetric (RS) subspace, $c(\vec{\sigma})$ depends on the argument $\vec{\sigma}$ only through $\sum_a \sigma^a$. This allows the introduction of the generating function $P(h)$:
The normalization of $c(\vec{\sigma})$ implies the normalization of the generating function, $\int_{-\infty}^{\infty} dh\, P(h) = 1$. Plugging this form into (3), one can easily obtain the analytical continuation in $n$, finally getting
where $P_{ft}(\nu) = \int dh\, e^{-i \nu h} P(h)$ denotes the Fourier transform of the generating function $P(h)$. The free energy (5) now has to be optimized with respect to $P(h)$.

To understand the physics hidden in this approach, it is useful to consider the Boolean magnetizations $m_i = \langle S_i \rangle$, where $\langle \cdot \rangle$ denotes the Gibbs average with Hamiltonian $\mathcal{H}$ at fixed disorder $\{C\}$, see Section II.A. An effective field $h_i$ is associated to each local magnetization $m_i$ through the relation $m_i = \tanh(\beta h_i)$. Within the RS framework, the order parameter $P(h)$ is simply the histogram of the effective fields,

$$ P(h) = \overline{ \frac{1}{N} \sum_{i=1}^{N} \delta(h - h_i) } , \qquad (6) $$

where the overbar denotes the average over the random choices of clauses $\{C\}$. At very low temperature, effective fields are related to elementary excitations around ground state configurations.



C. More sophisticated order parameters and replica symmetry breaking 

A corollary of Ansatz (4) is that the Hamming distance $d$ between any two assignments (i.e. the number of variables which are different in the two configurations), weighted with the Gibbs measure, almost surely equals

$$ d_{rs} = \frac{1}{2} - \frac{1}{2} \int dh\, P(h)\, (\tanh \beta h)^2 , \qquad (7) $$

once divided by $N$. In other words, on the $N$-dimensional hypercube whose vertices are the Boolean configurations, all relevant assignments belong to a single cluster of typical diameter $d_{rs} \cdot N$. On general grounds, there is no a priori reason to trust this simple picture. At zero temperature for instance, there could well exist a non-trivial geometrical organization of the space of solutions to the SAT or MAX-SAT problem which would give rise to a non-trivial (i.e. not fully concentrated) probability distribution for $d$. The simplest and immediate extension of this picture corresponds to a bimodal distribution for $d$, with two peaks in $d_0$ and $d_1$ ($< d_0$). The corresponding picture on the hypercube of configurations is that solutions are now gathered into different clusters having average internal diameter $d_1 \cdot N$ and being separated by a typical distance $d_0 \cdot N$.

Let us label these clusters, also called pure states, by a new index $\Gamma$. In a given cluster $\Gamma$, the magnetizations $m_i^\Gamma$ can be calculated as the average values of the spins $S_i$ over the $\mathcal{N}_\Gamma$ assignments belonging to $\Gamma$. As before, it is convenient to consider the effective fields $h_i^\Gamma$ through the relations $m_i^\Gamma = \tanh(\beta h_i^\Gamma)$. These effective fields fluctuate

• from state to state: for a given site $i$, the effective fields depend on the cluster $\Gamma$. We introduce the histogram $\rho_i(h) = \sum_\Gamma \mathcal{N}_\Gamma\, \delta(h - h_i^\Gamma) / \sum_\Gamma \mathcal{N}_\Gamma$ to take these fluctuations into account.

• from spin to spin: in turn, $\rho_i(h)$ explicitly depends upon the index $i$ of the variable it is related to. This multiplicity of field histograms is encoded in a functional probability distribution $\mathcal{P}[\rho] = \sum_{i=1}^{N} \delta[\rho(h) - \rho_i(h)]/N$ over the set of possible $\rho(h)$.

Within the replica formalism presented in Section II.A, the above picture corresponds to the first step of Parisi's hierarchical replica symmetry breaking (RSB) scheme [13]. The RSB order parameter $c(\vec{\sigma})$ reads

$$ c(\vec{\sigma}) = \int \mathcal{D}\rho\; \mathcal{P}[\rho] \prod_{b=1}^{n/m} \int dh\, \rho(h) \prod_{a=(b-1)m+1}^{bm} \frac{e^{\beta h \sigma^a}}{2 \cosh \beta h} . \qquad (8) $$

With the above Ansatz, the analytical continuation $n \to 0$ can be performed and the resulting free-energy (3) has to be optimized over $\mathcal{P}[\rho(h)]$. The parameter $m$ in (8) determines the relative importance of $d_0$ and $d_1$.



D. Variational approach 

The direct way to complete the calculation of the free energy (3) within the replica symmetric or the one-step broken approximation would be the following: a variation of (3) with respect to the order parameters yields functional equations for $P(h)$ or $\mathcal{P}[\rho(h)]$. In the replica symmetric case, this equation has been solved previously by a class of distributions consisting of a larger and larger number of Dirac peaks. In the replica symmetry broken case, only the very simplest possible solution could be obtained in [11]. The evaluation of any more involved solution seems a hopeless task due to the complexity of the saddle point equations.

In this paper a different route will be chosen [15]. Based on physical grounds, some simple trial functions for $P(h)$ or $\mathcal{P}[\rho(h)]$ will be proposed. These functions only depend on a small number of parameters; this fact significantly reduces the complexity of the problem. In the replica symmetric case, the exact results of [10] can be reproduced within a precision of less than one percent. In the replica symmetry broken case, new results can be obtained which go far beyond the previously available solution.



E. Zero temperature limit and scaling of the effective fields 

The phase transition in $K$-SAT separates a low-$\alpha$ regime in which all variables are typically under-constrained (SAT regime) from a high-$\alpha$ regime in which a finite fraction of variables is typically over-constrained (UNSAT regime).

Variables can be under-constrained when they do not appear in any clause, or more generally when the minimal number of violated clauses is independent of their possible assignments (true or false). In the language of statistical physics, such under-constrained variables correspond at low temperature $T$ to spins $S_i$ subjected to effective fields $h_i$ vanishing linearly with $T$: $h_i = T z_i$. This way, their magnetizations $m_i = \tanh(\beta h_i) = \tanh z_i$ are different from $\pm 1$ in the ground state. These unfrozen spins do not contribute to the energy when $T \to 0$, but only to the entropy. In the SAT phase, effective fields are expected to show this behavior at low temperatures.

Conversely, over-constrained variables correspond to spins $S_i$ seeing effective fields $h_i$ that remain finite, i.e. of the order of one, in the zero temperature limit. The excitation energy to flip any of these spins $S_i$ is finite and the spins are frozen in up or down directions depending on the signs of the associated fields $h_i$. In the zero temperature limit the only contribution to the energy comes from these frozen spins. To study the SAT-UNSAT transition, one therefore has to focus on the probability distribution of effective fields on the scale of $O(1)$. Note that on this scale, the effective fields corresponding to unfrozen spins vanish and give rise to a Dirac peak centered at zero; the weight of this peak is precisely the fraction of under-constrained spins.
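The notion of frozen (over-constrained) variables can be illustrated on tiny instances by enumerating all optimal assignments and collecting the variables that take the same value in every one of them. The following sketch is an exhaustive enumeration under an assumed signed-integer clause encoding; the function names are hypothetical, and the cost grows as $2^N$, so it is not a method for realistic sizes.

```python
import itertools

def energy(clauses, spins):
    """Number of violated clauses; spins[i-1] = +1 (true) or -1 (false)."""
    return sum(
        all(spins[abs(l) - 1] != (1 if l > 0 else -1) for l in clause)
        for clause in clauses
    )

def backbone(clauses, n_spins):
    """Return (ground-state energy, set of frozen variable indices)."""
    best, optimal = None, []
    for spins in itertools.product((1, -1), repeat=n_spins):
        e = energy(clauses, spins)
        if best is None or e < best:
            best, optimal = e, [spins]    # strictly better: restart the list
        elif e == best:
            optimal.append(spins)         # another optimal assignment
    # A variable is frozen if it takes one single value in all optima.
    frozen = {
        i + 1
        for i in range(n_spins)
        if len({spins[i] for spins in optimal}) == 1
    }
    return best, frozen

# x1 is forced to be true by the first two clauses; x2 and x3 stay free.
e_min, frozen = backbone([(1, 2), (1, -2), (2, 3)], 3)
print(e_min, frozen, len(frozen) / 3)
```

The last printed number is the fraction of frozen variables of this toy instance, the finite-size analogue of the backbone weight discussed in the text.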






III. THE SATISFIABLE PHASE 



A. Replica symmetric approximation 



As already discussed in the previous section, the interesting quantity to be calculated in the satisfiable phase ($\alpha < \alpha_c$) is the ground state entropy density $s = -\lim_{\beta \to \infty} \beta f$. In the replica symmetric approximation, the entropy $s$ reads, according to equation (5),
where $z = \beta h$ is the rescaled effective field of order one, see Section II.E. As a consequence, the distribution $P(z)$ has a finite mean and variance in the limit $\beta \to \infty$. As in the previous Section, $P_{ft}(\nu)$ denotes the Fourier transform of $P$. We start with a simple Gaussian Ansatz for the rescaled field distribution,

$$ P(z) = G_\Delta(z) , \qquad (10) $$

where $G_\Delta(z)$ denotes a Gaussian distribution with zero mean and variance $\Delta$. Note that $P(z)$ is expected to be even due to the symmetry of the disorder distribution. In the case of infinite-connectivity spin glasses eq. (10) would give the exact distribution, but due to the finite connectivities in $K$-SAT the effective fields are not necessarily Gaussian distributed. We find the expression
which has to be optimized numerically with respect to the variational parameter $\Delta$. Hereafter, $Dz = G_1(z)\, dz$ denotes the Gaussian measure with zero mean and variance one. For $\alpha = 0$ the variational parameter is found to be $\Delta = 0$, and the entropy follows to be $s_{rs} = \ln 2$: there are no clauses and the solution space coincides with the full phase space of the model. For increasing $\alpha$, the entropy diminishes due to the growing number of constraints, see figure 1. Our results are practically indistinguishable from the exact expansion of $s_{rs}$ in powers of $\alpha$. In figure 2, we show the typical Hamming distance $d_{rs}$ between two solutions; $d_{rs}$ decreases monotonically from $d = 0.5$ at $\alpha = 0$. This behavior signals a concentration of the solutions in configuration space.
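For a Gaussian distribution of rescaled fields, the RS distance (7) reduces to a one-dimensional Gaussian integral, $d_{rs} = \frac{1}{2} - \frac{1}{2} \int Dz\, \tanh^2(\sqrt{\Delta}\, z)$, which is easy to evaluate numerically. The sketch below uses a plain trapezoidal rule with only the Python standard library; the function names, the integration cutoff and the sample values of $\Delta$ are assumptions chosen for illustration and are not the optimized variational values of the paper.

```python
import math

def gaussian_average(func, n_points=4001, cutoff=10.0):
    """Average of func(z) over a unit Gaussian, by the trapezoidal rule."""
    step = 2.0 * cutoff / (n_points - 1)
    total = 0.0
    for j in range(n_points):
        z = -cutoff + j * step
        weight = 0.5 if j in (0, n_points - 1) else 1.0
        total += weight * func(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)

def hamming_distance_rs(delta):
    """d_rs = 1/2 - 1/2 <tanh^2(sqrt(delta) z)> for a Gaussian P(z) of variance delta."""
    root = math.sqrt(delta)
    overlap = gaussian_average(lambda z: math.tanh(root * z) ** 2)
    return 0.5 * (1.0 - overlap)

for delta in (0.0, 0.5, 2.0, 10.0):       # illustrative variances only
    print(delta, hamming_distance_rs(delta))
```

At $\Delta = 0$ the overlap vanishes and the distance equals $1/2$, the value quoted for $\alpha = 0$; a growing variance lowers the distance, in line with the concentration of solutions described above.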
FIG. 1. Variational entropy $s$ of the solutions for the 2-SAT (bold dashed line) and the 3-SAT (full line) problems as functions of the ratio of clauses per variable $\alpha$. The curves are practically indistinguishable from the full RS results. The vertical dashed lines indicate the thresholds $\alpha_c(2) = 1$ and $\alpha_c(3) \simeq 4.27$.

FIG. 2. Typical Hamming distance between two solutions of a random 3-SAT problem. Whereas there is only one distance $d$ in the replica symmetric phase $0 < \alpha < \alpha_s \simeq 3.96$, which decreases monotonically with the number of clauses per variable $\alpha$, we find two characteristic lengths in the replica symmetry broken case $\alpha_s < \alpha < \alpha_c \simeq$ 4.25-4.30. The Hamming distance $d_0$ between two clusters of solutions (upper line) remains almost constant with $\alpha$, whereas the entropy loss is mostly due to a shrinking of the typical size $d_1$ of the clusters themselves (lower curve). The dashed line denotes the continuation of the replica symmetric result.

In order to test the robustness of the results obtained with the help of Ansatz (10), we have repeated the above calculation with an exponential Ansatz for $P(z)$. The values of the entropy were changed by less than 1%. We have also taken into account the presence of free spins through the Ansatz

$$ P(z) = (1 - A)\, \delta(z) + A\, G_\Delta(z) \qquad (12) $$

for the rescaled distribution. The Dirac peak accounts for the variables which are not contained in the clauses as well as for spins having an effective field going to zero faster than linearly with the formal temperature $T$. Since a fraction $\exp(-\alpha K)$ of the variables does not appear in any clause, $1 - \exp(-\alpha K)$, i.e. the fraction of variables appearing in at least one clause, constitutes a rigorous upper bound for $A$, which was explicitly violated by the Ansätze considered so far. For 2-SAT at $\alpha = 1$, we find $A = 0.57 \pm 0.01$ (and $0.71 \pm 0.02$ if the Gaussian in (12) is again replaced by an exponential distribution), which has to be compared to the bound $1 - \exp(-2) \simeq 0.865$. The strong dependence of $A$ on the non-zero field part of the Ansatz probably results from the inclusion of small but non-zero rescaled fields into the Dirac peak. For 3-SAT, differences are less drastic: whereas the upper bound is almost 1, the above Ansatz gives $A = 0.94 \pm 0.02$ (and an $A$ numerically indistinguishable from 1 for the exponential distribution) at $\alpha = 4$. In all cases, these variations affect the entropy value by 1% at the most.
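The bound used in this comparison follows from a simple counting argument: in the large-$N$ limit, the number of clauses containing a given variable is Poisson distributed with mean $\alpha K$, so a fraction $e^{-\alpha K}$ of the variables is isolated. A short numerical check of the two values quoted above can be written as follows; the function name `max_weight_nonzero_fields` is an assumption introduced here.

```python
import math

def max_weight_nonzero_fields(alpha, k):
    """Upper bound 1 - exp(-alpha*k) on the weight of non-zero rescaled fields.

    exp(-alpha*k) is the large-N fraction of variables appearing in no clause.
    """
    return 1.0 - math.exp(-alpha * k)

print(max_weight_nonzero_fields(1.0, 2))   # 2-SAT at alpha = 1
print(max_weight_nonzero_fields(4.0, 3))   # 3-SAT at alpha = 4
```

The first value rounds to 0.865 and the second differs from 1 by about $e^{-12}$, consistent with the statement that the 3-SAT bound is almost 1.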



B. The replica symmetry breaking transition 

We have already underlined in Section II.C that the replica symmetric Ansatz is unable to reflect any non-trivial organization of the optimal assignments space. To investigate the ground state structure of $K$-SAT, we thus consider a one-step replica symmetry broken (RSB) Ansatz. According to the discussion of Section II.C, we choose
which coincides with the exact one-step expression for infinite-connectivity spin glass models. As in the RS hypothesis, $z = \beta h$ is a rescaled field which remains of order one in the zero-temperature limit. The detailed calculation of the variational RSB ground state entropy $s_{rsb}$ in the satisfiable phase is presented in appendix A. The result reads
This quantity has to be optimized with respect to the variational parameters $\Delta_0$, $\Delta_1$ and $m$. The numerical problem in calculating the solutions of the three equations
for 3-SAT consists in the sixfold integration in the first term of (14). It is much easier to determine the critical $\alpha_s$ where the first nontrivial solution of these equations can be found. In principle, due to the continuity of the entropy $s_{rsb}$ at this transition, there are two possible scenarios:

• A continuous transition in $\Delta_0 \to \Delta_{rs}$, $\Delta_1 \to 0$ takes place, where $\Delta_{rs}$ here and in the following denotes the replica symmetric value. This could be connected to a nontrivial $m_s$ with $0 < m_s < 1$.

• A jump in $\Delta_1$ towards a nontrivial value $> 0$ takes place at the transition. To guarantee the continuity of $s_{rsb}$, such a transition has to happen at $m = 1$ and $\Delta_0 = \Delta_{rs}$.

In the following two subsections we will consider both possibilities. Whereas a transition of the first type can be found at a certain $\alpha_s$, the second possibility can be ruled out. The value of $\alpha_s$ found this way constitutes an upper bound for the exact threshold. Indeed, we cannot exclude that, by taking into account a larger variety of density functionals (13), a non-trivial RSB solution could appear already at smaller values of $\alpha$.



1. The continuous transition 



In order to determine the critical value $\alpha_s$ for the replica symmetry breaking transition inside the SAT phase, we have to explicitly use the saddle-point equation obtained by differentiating $s_{rsb}$, equation (16), and expand it to first order in $\Delta_1$. As a result of the expansion, the inner integrals over $z_1$ can be carried out analytically, leaving only three integrals to be evaluated numerically. At the zeroth order, the replica symmetric saddle point equation for $\Delta_0$ is recovered. At the first order, the coefficient of the linear term in $\Delta_1$ vanishes at

$$ \alpha_s = 3.955 \pm 0.005 , \qquad (17) $$

thus allowing a non-zero solution for $\Delta_1$ to develop. This value is in surprisingly good agreement with a critical slowing down found numerically by Svenson and Nordahl [17]. They considered a simple zero-temperature Glauber dynamics for random satisfiability and coloring problems. In the case of 3-SAT this dynamics showed an exponential relaxation down to (almost) zero energy density for $\alpha < 4$, whereas the relaxation became algebraic for $\alpha > 4$, converging towards non-zero energy. As we shall see in Section III.C, the increase of $\Delta_1$ at $\alpha_s$ coincides with the emergence of a non-trivial structure of the optimal assignments of a typical 3-SAT instance. Due to the continuous nature of the transition at $\alpha_s$, it is probable that higher-lying, metastable states blow up simultaneously with the breaking up of the ground state structure. If it were so, the relationship between Svenson and Nordahl's result and the static transition at $\alpha_s$ could perhaps be explained along the lines developed in the context of the off-equilibrium dynamics of spin-glasses [18].

In order to calculate also $m_s$ at the transition, we have either to take into account the second order terms of equations (16) slightly above $\alpha_s$, or to explicitly solve the full saddle point equations (15) in the limit $\alpha \to \alpha_s^+$. We have followed the second route [19] and found $m_s \simeq 0.8$. The corrections to the entropy of solutions are however very weak: $s_{rs} = 0.917$ while $s_{rsb} = 0.911$ at $\alpha = 4.2$.

For 2-SAT, no such transition can be found before the SAT-UNSAT transition. A numerical investigation of the $2+p$-SAT model leads us to conjecture that the existence of a replica symmetry breaking transition within the SAT phase is related to the appearance of a discontinuous SAT-UNSAT transition, see Section IV.



2. Nonexistence of a discontinuous transition 

As already discussed above, one could also imagine a discontinuous transition with a jump in $\Delta_1$ even at a lower value of $\alpha$. Due to the continuity of the ground-state entropy, we would expect this transition to be continuous in $m$, i.e. to happen at $m = 1$.

The most interesting saddle point equation to be considered here is hence the $m$-equation. Exactly at the transition, a non-zero solution for $\Delta_1$ should show up in the equation

$$ 0 = \frac{\partial s_{rsb}}{\partial m} (m = 1, \Delta_0 = \Delta_{rs}, \Delta_1) . \qquad (18) $$

The other two variational equations (15) are automatically fulfilled at this point. In figure 3 we display $\partial_m s_{rsb}(m = 1, \Delta_0 = \Delta_{rs}, \Delta_1)$ for several values of $\alpha$ as a function of $\Delta_1$. For $\alpha < \alpha_s$, i.e. below the continuous transition found in the last subsection, the function decreases monotonically with $\Delta_1$ towards some finite asymptotic value. This clearly rules out any discontinuous transition in $\Delta_1$: the only zero of this function lies at $\Delta_1 = 0$. At the continuous transition the behavior in the vicinity of $\Delta_1 = 0$ changes. The sign of the second derivative of $\partial_m s_{rsb}$ with respect to $\Delta_1$ changes whereas the first derivative is always zero. This confirms again the local instability of the replica symmetric solution leading to the continuous transition found in the previous paragraph.

FIG. 3. $\partial_m s_{rsb}(m = 1, \Delta_0 = \Delta_{rs}, \Delta_1)$ as a function of $\Delta_1$. The figure shows the curves for $\alpha = 3.7, 3.8, 3.9, 4.0, 4.1$ (from bottom to top), illustrating the non-existence of a discontinuous transition. Inset: the second derivative $\partial^2_{\Delta_1}(\partial_m s_{rsb})$ vanishes at $\alpha_s = 3.955 \pm 0.005$.
C. Multiplicity of clusters

In this section, we give a geometrical interpretation of the RSB transition. Above $\alpha_s \simeq 3.96$, the ground state configurations are divided into an exponential number of well separated clusters. We are interested in the distribution of the entropy densities $s$ of these clusters, i.e. we want to count the clusters containing $\sim e^{Ns}$ satisfying assignments. This number will be denoted by $e^{N\omega(s)}$. The quantity $\omega(s)$ is hereafter referred to as the multiplicity of $s$. By definition of $\omega$,

$$ Z_m \equiv \int ds\; e^{N\omega(s)} \left( e^{Ns} \right)^m \qquad (19) $$
$$ \phantom{Z_m} = \lim_{\beta \to \infty} \sum_\Gamma \Big( \sum_{\{S\} \in \Gamma} e^{-\beta \mathcal{H}[\{C\},\{S\}]} \Big)^m , \qquad (20) $$

where $\Gamma$ denotes the clusters and $\mathcal{H}$ the energy cost function of Section II.A. $m$ is now a control parameter that can be varied to obtain $\omega(s)$. A straightforward calculation of (19) shows that $\tau(m) = \log Z_m / N$ is simply the Legendre transform of $\omega(s)$ in the large $N$ limit,

$$ \tau(m) = \lim_{N \to \infty} \frac{1}{N} \log Z_m = \mathrm{extr}_s \left[ \omega(s) + m s \right] . \qquad (21) $$

Thus to access the multiplicity $\omega$, we resort to the calculation of $\tau$, following closely the lines of earlier work (see also [21] and related calculations on the $p$-spin glass model and neural networks). We use again the replica trick, $\overline{\ln Z_m} = \lim_{n \to 0} \partial_n \overline{(Z_m)^n}$, and represent also the $m$-th power in (20) by a positive integer-valued $m$. This leads to $n \cdot m$ replicas of the original system obeying the one-step RSB algebra. Within our Gaussian variational scheme, we easily find

$$ \tau(m) = m\; \mathrm{extr}_{\Delta_0, \Delta_1} \left[ s_{rsb}(m, \Delta_0, \Delta_1) \right] \qquad (22) $$

where $s_{rsb}$ is given by (14). Consequently, the dominant clusters considered so far and obtained by optimization of $s_{rsb}$ over $m$ have zero multiplicity $\omega = -m^2\, \partial s_{rsb} / \partial m = 0$, i.e. their number is less than exponential in the system size $N$. Simultaneously, there exist exponentially numerous clusters with lower entropies such that the total number of satisfying assignments they contain remains much smaller than $e^{N s_{rsb}}$. These subdominant states appear as soon as $\alpha$ gets larger than $\alpha_s$. Below this transition there is no positive multiplicity at all; almost all solutions are collected in one large cluster.
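The expression for the multiplicity quoted above follows from inverting the Legendre transform (21) and inserting (22). A short worked derivation is given below; the shorthand $s^{*}(m)$ for the value of $s_{rsb}$ at its extremum over $\Delta_0$ and $\Delta_1$ is introduced here only for convenience.

```latex
\begin{align*}
\tau(m) &= \omega(s) + m\,s , \qquad \frac{\partial \omega}{\partial s} = -m
  && \text{(extremum condition in (21))} \\
s &= \frac{d\tau}{dm} , \qquad \omega = \tau(m) - m\,\frac{d\tau}{dm}
  && \text{(inverse Legendre transform)} \\
\tau(m) &= m\,s^{*}(m)
  \;\Rightarrow\; \frac{d\tau}{dm} = s^{*}(m) + m\,\frac{d s^{*}}{dm}
  && \text{(from (22))} \\
\omega &= m\,s^{*} - m\,s^{*} - m^{2}\,\frac{d s^{*}}{dm}
        = -\,m^{2}\,\frac{d s^{*}}{dm} .
\end{align*}
```

Since $s^{*}$ is stationary with respect to $\Delta_0$ and $\Delta_1$, its total derivative with respect to $m$ equals the partial derivative $\partial s_{rsb} / \partial m$ at the saddle point, so $\omega$ vanishes exactly where $s_{rsb}$ is also stationary in $m$. The first line further shows that the slope of $\omega(s)$ is $-m$.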

In figure 4 we show the results for $\alpha = 4.2$ (we have checked the qualitative similarity of the curves for other values of $\alpha$) obtained through a numerical solution of the variational equations in $\Delta_0$ and $\Delta_1$. At a certain $s = s_{rsb} = 0.911$ the multiplicity becomes positive, and the curve starts with slope $-m_{rsb}$, where $m_{rsb} \simeq 0.72$ is the value of $m$ that optimizes $s_{rsb}$. At $m = 0$, i.e. where the slope of $\omega$ over $s$ vanishes, we find again the replica symmetric entropy $s_{rs} = 0.917$ calculated at the beginning of this Section. However, the curve is only reliable up to the cusp: there, the second derivative $\partial^2 s_{rsb} / \partial m^2$ changes sign and the corresponding variational solution becomes unstable [23]. Note also that the RS entropy $s_{rs}$ is found back at $m = 1$ in the unphysical negative $\omega$ region. As happens also in the case of the spherical $p$-spin glass, there could already be an instability of the one-step solution due to unstable replicon modes [24], but the deviations from the given curve are expected to be very weak.

FIG. 4. Multiplicity $\omega(s)$ of states with entropy $s$ at $\alpha = 4.2$. The curve intersects the zero multiplicity axis at $s_{rsb} = 0.911$ and $s_{rs} = 0.917$. The full line shows the reliable part of the curve. Along the dashed line, the second derivative of $s_{rsb}$ is positive and the one-step replica symmetry broken Ansatz is no longer valid. The curves for other values of $\alpha > \alpha_s$ are qualitatively similar.

Although the order of magnitude of the multiplicity calculated above is small, some drastic changes take place at $\alpha_s$. This can best be seen on the typical Hamming distances between solutions. The quantity

$$ d_1 = \frac{1}{2} - \frac{1}{2} \int Dz\, \frac{\int D\tilde z\, \cosh^m(\sqrt{\Delta_0}\, z + \sqrt{\Delta_1}\, \tilde z)\, \tanh^2(\sqrt{\Delta_0}\, z + \sqrt{\Delta_1}\, \tilde z)}{\int D\tilde z\, \cosh^m(\sqrt{\Delta_0}\, z + \sqrt{\Delta_1}\, \tilde z)} \tag{23} $$

describes the average distance between two solutions inside the same cluster, whereas

$$ d_0 = \frac{1}{2} - \frac{1}{2} \int Dz \left[ \frac{\int D\tilde z\, \cosh^m(\sqrt{\Delta_0}\, z + \sqrt{\Delta_1}\, \tilde z)\, \tanh(\sqrt{\Delta_0}\, z + \sqrt{\Delta_1}\, \tilde z)}{\int D\tilde z\, \cosh^m(\sqrt{\Delta_0}\, z + \sqrt{\Delta_1}\, \tilde z)} \right]^2 \tag{24} $$

stands for the distance between two clusters. The results for the thermodynamically dominant states are shown in figure 2. We observe that $d_0$ is almost constant in $\alpha$, i.e. the relative positions of the clusters in configuration space remain roughly unchanged if new clauses are added to a given sample. In contrast to this behavior, the disappearance of solutions as $\alpha$ grows is accompanied by a rapid decrease of the cluster diameter $d_1$.
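The two distances above can be evaluated numerically once $\Delta_0$, $\Delta_1$ and $m$ are known. The following is a minimal sketch, assuming the forms $d_1 = (1 - q_1)/2$ and $d_0 = (1 - q_0)/2$ with the Gaussian averages written as plain midpoint sums; the function names and the parameter values in the usage note are illustrative assumptions and are not the optimized values of the variational calculation.

```python
import math

def gauss_nodes(n=161, cutoff=8.0):
    """Midpoint nodes and weights approximating the Gaussian measure Dz on [-cutoff, cutoff]."""
    step = 2.0 * cutoff / n
    nodes = [-cutoff + (k + 0.5) * step for k in range(n)]
    weights = [step * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) for z in nodes]
    return nodes, weights

def distances(delta0, delta1, m):
    """Return (d1, d0): intra-cluster and inter-cluster Hamming distances.

    delta0, delta1 are the two field variances and m the breakpoint parameter
    (hypothetical inputs here, chosen by the caller)."""
    nodes, weights = gauss_nodes()
    q1 = 0.0  # overlap inside a cluster
    q0 = 0.0  # overlap between clusters
    for z, wz in zip(nodes, weights):
        norm = num1 = num2 = 0.0
        for zt, wt in zip(nodes, weights):
            field = math.sqrt(delta0) * z + math.sqrt(delta1) * zt
            w = wt * math.cosh(field) ** m   # cosh^m reweighting of the inner average
            t = math.tanh(field)
            norm += w
            num1 += w * t
            num2 += w * t * t
        q1 += wz * num2 / norm
        q0 += wz * (num1 / norm) ** 2
    return 0.5 * (1.0 - q1), 0.5 * (1.0 - q0)
```

For instance, `distances(0.5, 1.0, 0.7)` returns a pair with $d_1 \le d_0 \le 1/2$, as expected since the inner average of $\tanh^2$ is never smaller than the square of the inner average of $\tanh$.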



D. Breakdown of the scaling of the effective fields

All the Ansätze we have used in the description of the SAT phase were based on the assumption of effective fields linearly vanishing in the zero-temperature limit. As we have argued, this scaling is no longer valid above the SAT-UNSAT transition. So we can extract a first variational estimate of the critical value $\alpha_c$ from the divergence of $\Delta$ in the replica symmetric and of $\Delta_1$ in the replica symmetry broken case. Compared with the numerical value $\alpha_c = 4.25 - 4.3$, the resulting values 4.76 (RS) and 4.66 (RSB) are rather crude approximations. The replica symmetric value can already be improved by taking (25) instead of (20). We find $\alpha_c \simeq 4.622$ with $B \simeq 0.935$, which is rather similar to the iterative replica symmetric result in [9]. As we shall show in the next Section, physically more elaborate approximations are needed to obtain better results for $\alpha_c$.



IV. THE SAT-UNSAT TRANSITION

A. Replica symmetric calculation

1. Variational RS free-energy

According to Sections II.B and II.E, we propose the following variational (replica symmetric) field distribution in the zero temperature limit

$$ P(h) = (1 - B)\,\delta(h) + \frac{B}{\sqrt{\Delta}}\, \Phi\!\left(\frac{h}{\sqrt{\Delta}}\right) . \tag{25} $$

$\Phi(x)$ is an even and decreasing probability distribution with argument $x = O(1)$. $B$ denotes the fraction of frozen spins and $\Delta$ ($= O(1)$ when $T \to 0$) the typical squared magnitude of the effective fields acting on frozen spins.

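Once $\Phi(x)$ has been chosen, we plug the trial variational function (25) into the free-energy functional. The resulting variational problem involves two parameters $B$ and $\Delta$ only and is therefore considerably simpler than the initial one. In the zero temperature limit we obtain

$$ f_{rs}(B,\Delta,\alpha,p) = -2\sqrt{\Delta}\, \left\{ \frac{B}{\pi} \int_0^{+\infty} \frac{d\nu}{\nu}\, \Phi'_{ft}(\nu)\, \ln[1 - B + B\,\Phi_{ft}(\nu)] - \alpha \int_0^{1/(2\sqrt{\Delta})} dh\, \left\{ (1-p)\, B^2\, [\Phi_{cc}(h)]^2 + p\, B^3\, [\Phi_{cc}(h)]^3 \right\} \right\} \tag{26} $$

where

$$ \Phi_{ft}(\nu) = \int_{-\infty}^{+\infty} dx\; e^{-i\nu x}\, \Phi(x) \tag{27} $$

$$ \Phi_{cc}(h) = \int_h^{+\infty} dx\; \Phi(x) \tag{28} $$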



are respectively the Fourier transform and the complementary cumulative function of $\Phi$. The above free-energy (26) corresponds to the 2+p-SAT problem which smoothly interpolates between 2-SAT ($p = 0$) and 3-SAT ($p = 1$), cf. Section I. For the sake of completeness, we give in Appendix B the derivation of (26) for the special case of a Gaussian distribution $\Phi(x) = G_1(x)$.

As we shall show below, within this simplified version of the variational problem it is possible to obtain results which are only slightly different from the ones obtained from the best replica symmetric solution [9]. The simplicity of this approach leads to a more transparent description of the SAT-UNSAT transition.



2. A smooth transition: the 2-SAT problem

We start with $p = 0$, i.e. the 2-SAT case. From previous numerical and analytical studies, it is known that the fraction of frozen spins is continuous at the transition and is zero for $\alpha < \alpha_c$. Actually, a numerical analysis of $f_{rs}(B,\Delta,\alpha,0)$ excludes the possibility of a first order transition in $B$ and $\Delta$. To locate the critical value of $\alpha$, we expand $f_{rs}(B,\Delta,\alpha,0)$ around $B = 0$ and $\Delta = 0$. To the leading order and neglecting irrelevant terms in $\Delta$, we find

$$ f_{rs}(B,\Delta,\alpha,0) \simeq -2\sqrt{\Delta}\, \left( B^2 f^{(2)}_{rs}(\alpha) + B^3 f^{(3)}_{rs} \right) \tag{29} $$

where

$$ f^{(2)}_{rs}(\alpha) = \frac{1}{\pi} \int_0^{+\infty} \frac{d\nu}{\nu}\, \Phi'_{ft}(\nu)\, [\Phi_{ft}(\nu) - 1] - \alpha \int_0^{+\infty} dh\, [\Phi_{cc}(h)]^2 \tag{30} $$

$$ f^{(3)}_{rs} = -\frac{1}{2\pi} \int_0^{+\infty} \frac{d\nu}{\nu}\, \Phi'_{ft}(\nu)\, [\Phi_{ft}(\nu) - 1]^2 . \tag{31} $$

$f^{(3)}_{rs}$ is clearly positive. Therefore the maximum of $f_{rs}$ is located at $B = 0$ if $f^{(2)}_{rs}(\alpha) > 0$ and at $B > 0$ if $f^{(2)}_{rs}(\alpha) < 0$.

It is easy to demonstrate that the threshold $\alpha_c$, determined through the condition $f^{(2)}_{rs}(\alpha_c) = 0$, is always equal to unity independently of the choice of the probability distribution $\Phi$. To do so, we rewrite the first term on the r.h.s. of (30), that is $f^{(2)}_{rs}(0)$, using the definition (27) of $\Phi_{ft}$,

$$ f^{(2)}_{rs}(0) = - \int_{-\infty}^{+\infty} dx\, dy\; x\, \Phi(x)\, \Phi(y)\, w(x,y) \tag{32} $$

with

$$ w(x,y) = \int_{-\infty}^{+\infty} \frac{d\nu}{2\pi}\, \frac{i}{\nu}\, e^{-i\nu x} \left( e^{-i\nu y} - 1 \right) = \frac{1}{2}\, \mathrm{sign}(x+y) - \frac{1}{2}\, \mathrm{sign}(x) . \tag{33} $$

Inserting (33) in (32), a simple calculation leads to

$$ f^{(2)}_{rs}(0) = \int_0^{+\infty} dh\, [\Phi_{cc}(h)]^2 , \tag{34} $$

and therefore to the reported result $\alpha_c = 1$. Surprisingly, the variational RS calculation is able to recover the exact threshold of 2-SAT in a very robust manner. Note however that the continuous growth of the backbone $B$ above $\alpha_c$ depends on the choice of $\Phi$.
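The independence of the 2-SAT threshold from $\Phi$ can be checked numerically. The sketch below assumes that the threshold is the ratio of the entropic integral $(1/\pi)\int_0^\infty (d\nu/\nu)\, \Phi'_{ft}(\nu)[\Phi_{ft}(\nu) - 1]$ to the energetic integral $\int_0^\infty dh\, [\Phi_{cc}(h)]^2$, and evaluates both with a simple midpoint rule for a Gaussian and an exponential $\Phi$. The helper names, cutoff and grid size are illustrative choices.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

# For each trial distribution: Phi_ft(nu), Phi_ft'(nu)/nu and Phi_cc(h) for h >= 0.
ANSATZE = {
    "gaussian": (
        lambda nu: math.exp(-0.5 * nu * nu),
        lambda nu: -math.exp(-0.5 * nu * nu),
        lambda h: 0.5 * math.erfc(h / math.sqrt(2.0)),
    ),
    "exponential": (
        lambda nu: 1.0 / (1.0 + nu * nu),
        lambda nu: -2.0 / (1.0 + nu * nu) ** 2,
        lambda h: 0.5 * math.exp(-h),
    ),
}

def threshold_2sat(name, cutoff=60.0, n=60000):
    """Value of alpha at which the quadratic coefficient in B changes sign."""
    ft, dft_over_nu, cc = ANSATZE[name]
    entropic = midpoint(lambda nu: dft_over_nu(nu) * (ft(nu) - 1.0), 0.0, cutoff, n) / math.pi
    energetic = midpoint(lambda h: cc(h) ** 2, 0.0, cutoff, n)
    return entropic / energetic
```

Both `threshold_2sat("gaussian")` and `threshold_2sat("exponential")` come out equal to one within the quadrature error, in line with the analytical argument.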



3. A discontinuous transition: the 3-SAT problem

We now focus on the 3-SAT case ($p = 1$). Previous numerical and analytical studies have shown that spins freeze discontinuously at the transition. Thus, we cannot locate the threshold through an expansion of $f_{rs}(B,\Delta,\alpha,1)$ as in the 2-SAT case. The full variational calculation can nevertheless be simplified due to the following observation. In the SAT phase ($B = 0$), the free-energy is identically zero. Within a first-order transition scenario, the threshold will be the value of $\alpha$ at which the free energy of the UNSAT phase ($B \neq 0$, $\Delta \neq 0$) changes sign to become thermodynamically stable. The calculation of $\alpha_c$ becomes simpler once the free-energy (26) is rewritten as

$$ f_{rs}(B,\Delta,\alpha,1) = 2\sqrt{\Delta}\, B^3 \left[ - s_{rs}(B) + \alpha\, e_{rs}(\Delta) \right] \tag{35} $$

with

$$ s_{rs}(B) = \frac{1}{\pi B^2} \int_0^{+\infty} \frac{d\nu}{\nu}\, \Phi'_{ft}(\nu)\, \ln[1 - B + B\,\Phi_{ft}(\nu)] \tag{36} $$

$$ e_{rs}(\Delta) = \int_0^{1/(2\sqrt{\Delta})} dh\, [\Phi_{cc}(h)]^3 . \tag{37} $$

Calling $B_c$ (respectively $\Delta_c$) the argument where $s_{rs}$ (resp. $e_{rs}$) reaches its minimal (resp. maximal) value, we obtain from (35)-(37) the following expression of the threshold

$$ \alpha_c = \frac{\min_B s_{rs}(B)}{\max_\Delta e_{rs}(\Delta)} = \frac{s_{rs}(B_c)}{e_{rs}(\Delta_c)} . \tag{38} $$

The maximum of $e_{rs}$ is obviously located at $\Delta_c = 0$ whereas the precise value of $B_c$ depends on the field distribution $\Phi$. We list below the results obtained for three different choices.

• Gaussian distribution: $\Phi(x) = G_1(x)$, $B_c \simeq 0.935$, $\alpha_c \simeq 4.622$.

• Exponential distribution: $\Phi(x) = \frac{1}{2} e^{-|x|}$, $B_c \simeq 0.976$, $\alpha_c \simeq 4.617$.

• Lorentzian distribution: $\Phi(x) = \frac{1}{\pi (1 + x^2)}$, $B_c \simeq 0.986$, $\alpha_c \simeq 4.983$.
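The Gaussian entry of this list can be reproduced with a few lines of numerics. The sketch below assumes the threshold is the minimum over $B$ of the entropic integral $(1/\pi)\int_0^\infty (d\nu/\nu)\,\Phi'_{ft}\ln[1 - B + B\Phi_{ft}]$ divided by $B^{K-1}\int_0^\infty dh\,[\Phi_{cc}(h)]^K$ with $K = 3$, i.e. with the energetic term taken at $\Delta = 0$. The function names, the Simpson grid and the scan over $B$ are illustrative choices, not the procedure used to produce the values quoted above.

```python
import math

def simpson(f, a, b, n=1200):
    """Composite Simpson rule with n (even) sub-intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3.0

def phi_cc(h):
    """Complementary cumulative function of the unit Gaussian."""
    return 0.5 * math.erfc(h / math.sqrt(2.0))

def energetic(K):
    """Integral of Phi_cc^K over h >= 0 (upper limit 1/(2 sqrt(Delta)) sent to infinity)."""
    return simpson(lambda h: phi_cc(h) ** K, 0.0, 12.0)

def entropic(B):
    """(1/pi) * integral of Phi_ft'(nu)/nu * ln[1 - B + B Phi_ft(nu)], Phi_ft = exp(-nu^2/2)."""
    def integrand(nu):
        g = math.exp(-0.5 * nu * nu)
        return -g * math.log(1.0 - B + B * g)
    return simpson(integrand, 0.0, 12.0) / math.pi

def alpha_sign_change(B, K=3):
    """Value of alpha where the K-clause free energy changes sign at fixed B."""
    return entropic(B) / (B ** (K - 1) * energetic(K))

def threshold_3sat(grid=500):
    """Scan B in (0.5, 1] and return (alpha_c, B_c) at the minimum."""
    e3 = energetic(3)
    candidates = ((entropic(B) / (B * B * e3), B)
                  for B in (0.5 + 0.5 * k / grid for k in range(1, grid + 1)))
    return min(candidates)
```

Under these assumptions `threshold_3sat()` lands close to the Gaussian values listed above, while `alpha_sign_change(1.0, 3)` gives the cruder estimate obtained when all spins are assumed frozen.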

Above the threshold, $B$ and $\Delta$ both increase with $\alpha$ from their critical values $B_c$ and $\Delta_c$ ($= 0$). The variational approach is thus able to reproduce the qualitative picture of the mixed nature of the phase transition (second order in $\Delta$ and first order in the backbone size $B$) that emerged from the iterative RS solution [9] and numerics. This prediction is quite robust with respect to the choice of $\Phi(x)$. Even a Lorentzian distribution gives reasonable results for $\alpha_c$ and $B_c$ though its large-field tail is not physically sensible, see Section V.B.

From a quantitative point of view, the above results for the Gaussian and the exponential case differ from the iterative RS solution $B_c \simeq 0.94$, $\alpha_c \simeq 4.60$ [9] by a few percent only. However, the latter was derived through a much less convenient iterative scheme.

4. The tricritical point $p_0$



As we have seen above, the main difference between 2-SAT and 3-SAT lies in the behaviour of the fraction of frozen spins at the transition. In other words, the backbone size at threshold vanishes in the former case ($B_c(p = 0) = 0$), while it exhibits a discontinuous jump in the latter case ($B_c(p = 1) > 0$). It is natural to expect the existence of a tricritical point $p_0$ separating continuous SAT-UNSAT transitions ($p < p_0$) from discontinuous ones ($p > p_0$) [10].

When $p < p_0$, the transition can be studied through an expansion of the free-energy (26) in powers of the backbone size, see (29),

$$ f_{rs}(B,\Delta,\alpha,p) \simeq -2\sqrt{\Delta}\, \left( B^2 f^{(2)}_{rs}(\alpha,p) + B^3 f^{(3)}_{rs}(\alpha,p) \right) \tag{39} $$

where, using (32)-(34),

$$ f^{(2)}_{rs}(\alpha,p) = \left( 1 - (1-p)\,\alpha \right) \int_0^{+\infty} dh\, [\Phi_{cc}(h)]^2 \tag{40} $$

$$ f^{(3)}_{rs}(\alpha,p) = -\frac{1}{2\pi} \int_0^{+\infty} \frac{d\nu}{\nu}\, \Phi'_{ft}(\nu)\, [\Phi_{ft}(\nu) - 1]^2 - \alpha\, p \int_0^{+\infty} dh\, [\Phi_{cc}(h)]^3 . \tag{41} $$

As long as $f^{(3)}_{rs}$ remains positive, the threshold is situated at $\alpha_c(2+p) = 1/(1-p)$, see (40). As in the 2-SAT case, this result does not depend on the distribution $\Phi$ in (25) and coincides with the rigorous result known for $p < 2/5$. At a given $p$ and slightly above the threshold, the backbone size scales as

$$ B \sim \frac{\alpha - \alpha_c(2+p)}{f^{(3)}_{rs}(\alpha_c(2+p), p)} , \tag{42} $$

up to a constant multiplicative factor. The tricritical point $p_0$ can thus be found through the condition $f^{(3)}_{rs}(\alpha_c(2+p_0), p_0) = 0$. This statement remains unaffected by the inclusion of higher order terms in $\Delta$ in the expansion (39). The corresponding values of $p_0$ for the three choices of $\Phi$ of the previous paragraph are: $p_0 \simeq 0.437$ for the Gaussian distribution, $p_0 = 3/7 \simeq 0.429$ for the exponential distribution and $p_0 \simeq 0.418$ for the Lorentzian distribution. As expected, these values are slightly higher than the prediction of the iterative RS solution, $0.4 < p_0 < 0.416$ [10].
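These three values follow from a one-line condition once two integrals are known. As a hedged sketch: assuming the cubic coefficient vanishes when $-(1/2\pi)\int_0^\infty (d\nu/\nu)\,\Phi'_{ft}[\Phi_{ft} - 1]^2$ equals $\frac{p_0}{1-p_0}\int_0^\infty dh\,[\Phi_{cc}(h)]^3$ (i.e. with $\alpha = 1/(1-p_0)$ inserted), $p_0$ is obtained from the ratio of the two integrals. The helper names, cutoff and grid below are illustrative choices.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

# For each trial distribution: Phi_ft(nu), Phi_ft'(nu)/nu and Phi_cc(h) for h >= 0.
ANSATZE = {
    "gaussian": (
        lambda nu: math.exp(-0.5 * nu * nu),
        lambda nu: -math.exp(-0.5 * nu * nu),
        lambda h: 0.5 * math.erfc(h / math.sqrt(2.0)),
    ),
    "exponential": (
        lambda nu: 1.0 / (1.0 + nu * nu),
        lambda nu: -2.0 / (1.0 + nu * nu) ** 2,
        lambda h: 0.5 * math.exp(-h),
    ),
    "lorentzian": (
        lambda nu: math.exp(-nu),
        lambda nu: -math.exp(-nu) / nu,
        lambda h: 0.5 - math.atan(h) / math.pi,
    ),
}

def tricritical_point(name, cutoff=100.0, n=100000):
    """Solve entropic = p0/(1 - p0) * energetic for p0."""
    ft, dft_over_nu, cc = ANSATZE[name]
    entropic = -midpoint(lambda nu: dft_over_nu(nu) * (ft(nu) - 1.0) ** 2,
                         0.0, cutoff, n) / (2.0 * math.pi)
    energetic = midpoint(lambda h: cc(h) ** 3, 0.0, cutoff, n)
    ratio = entropic / energetic          # equals p0 / (1 - p0)
    return ratio / (1.0 + ratio)
```

For the exponential distribution both integrals are elementary ($1/32$ and $1/24$), which gives $p_0/(1-p_0) = 3/4$ and hence $p_0 = 3/7$ exactly; the numerical routine reproduces this and the two other quoted values to about three digits.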

We shall now show, under some assumptions exposed in Appendix C, that the tricritical point is precisely located at $p_0 = 2/5$. To do so, we proceed in two steps. Firstly, we recall that the equality $\alpha_c(2+p) = 1/(1-p)$ for $p < 0.4$ has been rigorously demonstrated. Secondly, using the RS variational approach, we have seen above that $\alpha_c(2+p) = 1/(1-p)$ up to a tricritical $p_0$ which depends on $\Phi$ through the condition $f^{(3)}_{rs}(1/(1-p_0), p_0) = 0$. Consider now two different Ansätze $\Phi^{(1)}$ and $\Phi^{(2)}$ such that the corresponding tricritical points satisfy $p_0^{(1)} < p_0^{(2)}$. Then, for $p$ in the range $p_0^{(1)} < p < p_0^{(2)}$, we have $\alpha_c^{(2)}(2+p) = 1/(1-p)$ by definition of $p_0^{(2)}$ and $\alpha_c^{(1)}(2+p) < 1/(1-p)$ (the 2-clauses part of the formula is almost surely satisfiable if and only if $\alpha\,(1-p) < \alpha_c(2) = 1$, giving thus this upper bound to the threshold). Let us choose $\alpha$ with $\alpha_c^{(1)}(2+p) < \alpha < \alpha_c^{(2)}(2+p)$. For Ansatz 2, the free energy vanishes at this $\alpha$, while within Ansatz 1 it is strictly positive. Since the free-energy has to be maximized (see Section II.A), the first Ansatz has to be preferred to the second one. Consequently, $p_0$ has to be minimized over the choice of possible distributions $\Phi$ and an upper bound to the true value of $p_0$ is provided by the minimal value of $p_0$ within the RS variational calculation. We show in Appendix C that the latter already equals 2/5.



5. Comments on the variational RS calculation

The main ingredient of our trial variational function is the separation between the effective fields of order one seen by frozen spins and the fields of order $T = 1/\beta$ acting on unfrozen spins. The crucial importance of this separation can be easily seen a posteriori with a simple Gaussian Ansatz for $P(h)$, which amounts to setting $B$ to unity in (26). In this case, one finds $\alpha_c \simeq 4.76$ for 3-SAT, a rather high value. For 2-SAT, the situation is even worse: the predicted value for the threshold, $\alpha_c \simeq 1.7$, is totally wrong while the correct value $\alpha_c = 1$ was successfully obtained by optimizing over $B$. This result is not surprising: slightly above $\alpha_c$, there are only few constrained spins whereas the Gaussian Ansatz with $B = 1$ and $\Delta > 0$ amounts to considering that all spins are frozen.

Besides its technical simplicity, the variational calculation provides a better understanding of the transition. While the iterative RS scheme used in [9] was rather involved and the resulting shape of the field distribution remained unclear, the two-parameter variational theory presented here stresses unambiguously the mixed nature of the SAT-UNSAT transition: of first order with respect to the backbone size $B$ and continuous with respect to the intensity $\Delta$ of the effective fields related to excited configurations. However, from a quantitative point of view, the predicted value of $B_c$ is much larger than the numerical result. This discrepancy stems from an intrinsic weakness of replica symmetry, which is unable to distinguish between different kinds of frozen spins (belonging or not to the backbone). As a result, the parameter $B$ obtained from the variational RS calculation takes into account all frozen variables and thus overestimates the backbone size. We shall see in the next Section how replica symmetry breaking has to be introduced to solve this problem.



B. Replica symmetry breaking calculation

The variational calculation exposed in Section IV.A as well as the iterative replica symmetric solution of [9,10] provide qualitative insights into the physical features of the SAT-UNSAT transition. From a quantitative point of view, the RS Ansatz however fails to predict accurately the threshold of 3-SAT: the estimate $\alpha_c \simeq 4.6$ lies above numerical findings $\alpha_c \simeq 4.25 - 4.30$. This by itself indicates that a replica symmetry broken (RSB) theory of K-SAT has to be sought for. Another strong hint is of course the appearance of RSB in the ground state structure already in the SAT phase, as discussed in Sections III.B and III.C.



1. Structure of the RSB field distributions 



A major qualitative weakness of the RS Ansatz underlined in paragraph IV.A.5 lies in its inability to distinguish spins frozen always in the same direction (backbone) from spins frozen up and down depending on the particular ground state cluster. For this reason, the RS analysis can only predict that the fraction of frozen spins is $\simeq 0.93$ but does not tell us the size of the subset that truly belongs to the backbone of solutions.

Results on the backbone can be obtained within the replica symmetry breaking analysis. In the latter, the backbone may be defined as the fraction of frozen spins $S_i$ which do not change direction from cluster to cluster. The distribution of effective fields $\rho_i(h)$ of such a spin has its whole support on the positive (or the negative) semi-axis only, see Section II.C. Conversely, frozen spins that do not belong to the backbone can fluctuate from state to state: their corresponding probability distribution $\rho_i(h)$ may extend over the entire real axis.

On the basis of the previous considerations, we propose the following (one step) RSB variational Ansatz,



$$ \mathcal{P}[\rho(h)] = (1 - B_1 - B_0)\; \delta\big[ \rho(h) - \delta(h) \big] + B_0 \int_{-\infty}^{+\infty} d\tilde h\; \phi_0(\tilde h)\; \delta\big[ \rho(h) - \psi_0(h, \tilde h) \big] + B_1 \int_{-\infty}^{+\infty} d\tilde h\; \phi_1(\tilde h)\; \delta\big[ \rho(h) - \psi_1(h, \tilde h) \big] . \tag{43} $$

In the above expression, $\delta[\cdot]$ denotes a functional Dirac distribution [11]. The first term on the r.h.s. of (43) is the contribution due to unfrozen spins, whereas the two other terms include two kinds of frozen spins. $B_0$ gives the fraction of variables in the backbone. $\psi_0(h, \tilde h)$ denotes the distribution of the effective fields $h$ at one site while fluctuations from site to site are taken into account through $\tilde h$ and the distribution $\phi_0(\tilde h)$. Thus, at fixed $\tilde h$, the distribution $\psi_0(h, \tilde h)$ of $h$ has a support on the semi-axis having the same sign as $\tilde h$. The last term of (43) is associated with frozen spins not belonging to the backbone. The effective field distribution $\psi_1(h, \tilde h)$ has therefore no a priori restriction on the sign of $h$.

To obtain mathematically tractable expressions, we have made the following choices for the above field distributions:

$$ \psi_0(h, \tilde h) = \delta(h - \tilde h) , \qquad \psi_1(h, \tilde h) = G_{\Delta_1}(h - \tilde h) . \tag{44} $$

The arbitrary choice for $\phi_1$ simply means that close to the transition the typical value of $\tilde h$ is much smaller than that of $h$. Indeed, as in the RS case, effective fields acting on frozen spins are expected to vanish at the transition (coming from the high $\alpha$ phase). Ansatz (44) is the simplest one compatible with the sign restriction on the support of $\psi_0$ and the unbiased distribution of literals in clauses imposing $\mathcal{P}[\rho(h)] = \mathcal{P}[\rho(-h)]$.

2. Analysis of the transition for 3-SAT

The trial variational function (43) with (44) can be plugged into the variational free-energy. As the temperature $T$ is sent to zero, the resulting variational problem involves four parameters: $B_1$, $B_0$, $r = \sqrt{\Delta_0/\Delta_1}$ and $\mu = \beta m \sqrt{\Delta_1}$. The variances $\Delta_0$ and $\Delta_1$ of the fields vanish at the transition (see Appendix C) and enter the free-energy only through the finite ratio $r$. Moreover, at very low temperatures, the breakpoint parameter $m$ naturally behaves as $O(T)$. Since the number of states having an excess free energy $F$ with respect to the lowest lying state scales as $e^{\beta m F}$, $\beta m$ stays finite when $T = 0$ so as not to spoil RSB effects. Furthermore, to match the SAT phase ($m = O(1)$, i.e. $\beta m = \infty$), $\beta m$ has to diverge at the threshold $\alpha_c$ when coming from the RSB-UNSAT phase. This divergence makes $\mu = \beta m \sqrt{\Delta_1}$ finite at the transition.

The variational RSB free-energy is computed in Appendix D and written below,

frsbiBi,Bo, r,fi) = - / dxdy L{x, y) [l - B^ - B^ + Sie"" + B^e-y] In [l - B^ - B^ + Bier'' + Boe"^] 
Jo 



^jB,B, I dxU 2mi,)-2,r,dye'^.ymiy) ' ^^^^ 



2«E 



g=0 

where L(x,y), that depends on r and is the double inverse Laplace transform of 



r+oo 

JCi^iy/^, firVb) = dxdy e-'^^-y L{x,y) , (46) 







with /C(a, b) defined as 



(47) 



Dy\n / Dxe\''''+^y\ 

-oo LJ — oo 

The function H{x) equals 

H{x) = Dye'^y . (48) 



To compute $f_{rsb}$, we have expanded the logarithm in the first term of the r.h.s. of (45) in powers of $B_1$ and $B_0$, using then the definition (46) of $L$ to perform the integrations over $x$ and $y$. The main difficulty with this procedure is that results obviously depend on the number $j$ of terms considered in the series expansion, see Appendix D. In the simple case $\mu = 0$, we have checked that the optimal (and $j$-dependent) free-energy $f_{rsb}(j)$ reaches its exact value $f_{rsb}$ with $1/j^2$ corrections as $j$ grows. In the general case $\mu \neq 0$, numerical results support this scaling: $f_{rsb}(j) = f_{rsb} + O(1/j^2)$. For instance we show in the inset of fig. 5 the threshold $\alpha_c(j)$ versus $1/j^2$. Using this procedure, we have found that the SAT-UNSAT transition takes place at $\alpha_c \simeq 4.480 \pm 0.003$. This value is still higher than the numerical one, $\alpha_c \simeq 4.25 - 4.30$, but definitely improves the RS result $\alpha_c \simeq 4.60$ and lies below the best known rigorous upper bounds. Moreover, at the transition we find $B_0 \simeq 0.13 \pm 0.01$, $B_1 \simeq 0.79 \pm 0.01$, $\mu \simeq 0.88 \pm 0.02$, and $r \simeq 1.4 \pm 0.1$.
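The extrapolation in $1/j^2$ described above amounts to a straight-line fit against $x = 1/j^2$, with the intercept giving the $j \to \infty$ value. The following is a minimal sketch of such a fit; the data in it are synthetic numbers generated from a known law purely for illustration and are not the values of $\alpha_c(j)$ obtained in the variational calculation.

```python
def extrapolate(js, values):
    """Least-squares fit values ~ a + b / j**2; returns (a, b).

    a is the extrapolated j -> infinity limit, b the amplitude of the
    1/j^2 correction."""
    xs = [1.0 / (j * j) for j in js]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Synthetic, illustrative data only: exact 1/j^2 law with limit 1.0 and amplitude 0.5.
js = [2, 3, 4, 5, 6]
synthetic = [1.0 + 0.5 / (j * j) for j in js]
limit, amplitude = extrapolate(js, synthetic)
```

With real data the quality of the fit (for instance the residuals as a function of $j$) indicates whether the assumed $1/j^2$ correction is adequate over the range of $j$ used.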

What is the meaning of $\mu$? Consider at a given $\alpha > \alpha_c$ the clusters corresponding to the ground state energy $E_{GS}$, i.e. with the minimal number $= O(N)$ of violated clauses. Higher-lying clusters $\Gamma$ exist which make slightly more mistakes: $E_\Gamma = E_{GS} + \epsilon_\Gamma$ with $\epsilon_\Gamma = O(1)$. Imagine now that we add $c = O(1)$ new clauses to the instance we have considered so far. Clearly, the previous ground state assignments are not necessarily optimal any longer and can be supplanted by configurations belonging to some clusters $\Gamma$ such that $\epsilon_\Gamma < c$. Therefore, these clusters and the distribution of the related $\epsilon_\Gamma$ are of interest to understand how an instance of K-SAT can adapt in response to some change in the constraints. From a more quantitative point of view, let us define the linear susceptibility $\chi_\Gamma$ as the (free-)energy change of the cluster $\Gamma$ when the typical magnitude of the effective fields grows from zero up to $\sqrt{\Delta_1}$, divided by $\sqrt{\Delta_1}$. Within the RSB variational approach, the number of quasi-optimal clusters having a susceptibility equal to $\chi$ scales as

$$ \mathcal{N}(\chi) \sim e^{\mu \chi} . \tag{49} $$

The above equation unveils the meaning of the parameter $\mu$ associated with the breaking of replica symmetry.

3. Comments on the replica symmetry breaking solutions

As stressed in Section II.D, it is crucial to distinguish fields of the order of one from vanishing fields when $T \to 0$. The importance of this separation for the RSB solution can be checked within the variational subspace $B_1 + B_0 = 1$ in (45), that is discarding unfrozen spins. In this case, we find $\alpha_c \simeq 4.66$ and $B_1 = 1$, $B_0 = 0$. This result is quantitatively and qualitatively erroneous, because the value of $\alpha_c$ is even higher than the value predicted within the replica symmetric analysis and the fraction of spins belonging to the backbone is zero. The value of $\alpha_c$ can be improved by fixing $B_0 = 0$ and optimizing over $B_1$. In this case we find $\alpha_c \simeq 4.51$, $B_1 \simeq 0.925$, $\mu \simeq 0.8$ and there is no backbone.

The backbone is taken into account when relaxing the constraint $B_0 = 0$. The corresponding variational calculation has been exposed in the previous paragraph. Let us briefly comment on the results. First of all, we note that the fraction of frozen spins $B_1 + B_0 \simeq 0.92$ changes by a few percent with respect to the RS case. This value is quite robust and should be quantitatively correct. Conversely, the fraction of spins belonging to the backbone $B_0 \simeq 0.13$ is underestimated with respect to numerical findings, which predict a value of 0.4 for small instances. This probably stems from the choices of the field distributions (44) which break replica symmetry for spins not belonging to the backbone only. Therefore in the variational treatment the latter are thermodynamically favoured and the computed fraction of spins belonging to the backbone is smaller than the true one. Breaking replica symmetry also for these spins would presumably allow one to obtain better values for $\alpha_c$ and $B_0$. This would however be a hard task due to the technical difficulties arising in the numerical computation.

To strengthen this intuition, we consider the $\mu \to 0$ limit of the RSB free-energy (45), which amounts to treating the two kinds of frozen spins on the same footing. The latter becomes simplified and corresponds to the RS free-energy obtained from the following RS field distribution,

$$ P(h) = (1 - B_1 - B_0)\, \delta(h) + B_1\, \frac{e^{-h^2/2\Delta_1}}{\sqrt{2\pi\Delta_1}} + B_0\, \frac{e^{-h^2/2\Delta_0}}{\sqrt{2\pi\Delta_0}} . \tag{50} $$

Optimizing $f_{rsb}(B_1, B_0, r, \mu = 0)$, we found a transition at $\alpha_c \simeq 4.60$, which quantitatively coincides with the best known RS solution. The corresponding values of the variational parameters are $B_1 \simeq 0.65$, $B_0 \simeq 0.29$ and $r \simeq 0.49$. Once more, the total fraction of frozen spins is close to 0.94. However, as conjectured, the relative fraction of backbone spins increases drastically, by a factor of two, with respect to the result of Section IV.B.

The values of $\alpha_c$ and $B_0$ predicted using the above mentioned Ansätze are shown in fig. 5 and compared to the results of numerical simulations and to rigorous bounds.

FIG. 5. Values of $\alpha_c$ and $B_0$ at the SAT-UNSAT transition for the different Ansätze presented in Section IV. RS1, RS2 and RS3 correspond respectively to the replica symmetric Ansätze with one Gaussian, with one Gaussian and a Dirac peak, and with two Gaussians and a Dirac peak. RSB1, RSB2 and RSB3 are their generalizations to the replica symmetry broken case. The dashed line gives the best known rigorous upper bound on the value of $\alpha_c$. The dotted line and the circle respectively show the values of $\alpha_c$ and $B_0$ found in numerical simulations. Note that the value of $\alpha_c$ is more reliable than the estimate of $B_0$ due to the sample sizes used to determine the former ($N = 250$) and the latter ($N = 26$). Inset: scaling of $\alpha_c(j)$ as a function of $1/j^2$ (where $j$ is the number of terms considered in the series expansion of the effective entropy contribution).






V. DISCUSSION AND CONCLUSION 



A. A-SAT picture arising from the variational calculation 



The variational calculations of the last two sections lead us to propose the following picture of the 3-SAT problem. At very low $\alpha$ each variable $x_i$ is under-constrained, i.e. both SAT instances which result from fixing $x_i$ either to true or to false are satisfiable with probability one. By adding new clauses, the number $\alpha$ of constraints per variable is increased and the solution space shrinks. The latter is made of a single cluster without any particular internal structure. Its diameter $d$ decreases monotonically with the number of clauses, thus signalling a concentration of the satisfying assignments in configuration space, see fig. 2.

When $\alpha$ reaches $\alpha_s \simeq 3.96$, the set of all solutions continuously breaks up into an exponential number (in $N$) of geometrically separated clusters, see fig. 6 for a schematic representation. The instance remains nevertheless satisfiable, and the variables are still under-constrained. If we further increase the number $\alpha N$ of clauses, the typical distance $d_0$ between clusters remains nearly unchanged. The decrease of the entropy of solutions is thus essentially due to the decrease of the average diameter $d_1$ of the clusters (fig. 2).

Increasing $\alpha$, the system becomes unsatisfiable with probability one at a certain value $\alpha_c$, i.e. it undergoes a SAT-UNSAT transition. In the optimal assignments (which minimize the number of violated clauses), a large fraction of variables (approximately 90%) becomes over-constrained. The mixed nature of the SAT-UNSAT transition can be seen explicitly: whereas the fraction of frozen spins jumps up discontinuously, the effective fields measuring the strength of the constraints on each variable grow continuously. Moreover the existence of different clusters of optimal configurations allows the distinction between two groups of over-constrained variables. The first group (backbone) contains variables keeping the same truth value in all optimal configurations. In the second group, the variables have a cluster-dependent value. In other words, optimal configurations corresponding to different truth values of the second group variables necessarily belong to distinct clusters and lie at $O(N)$ distances from each other.

It is important to note that even in the UNSAT regime these frozen spins coexist with under-constrained variables. These unfrozen variables lead to a positive entropy at the transition [26], a behavior which is intrinsically different from the case of infinite connectivity models [29].

Besides, we should mention that the actual cluster distribution could be even more complicated, e.g. through the existence of clusters of clusters etc. The existence of only two typical distances, and thus the distinction between two kinds of frozen spins, is intrinsic to our one-step replica symmetry broken Ansatz. We nevertheless expect that the main qualitative features of 3-SAT are already captured in our one-step broken variational description.

Finally, we note that in the 2+p-SAT case for $p < p_0 = 2/5$ the picture arising from the variational calculation is much simpler. In fact, $\alpha_s$ and $\alpha_c$ coincide and the transition from the under-constrained SAT regime to the over-constrained UNSAT phase is smooth. The geometrically non-trivial intermediate phase does not exist at all.

FIG. 6. Schematic representation of the solution space structure of 3-SAT for two values of $\alpha$ with $\alpha_s < \alpha < \alpha_c$ resulting from the one-step replica symmetry broken variational Ansatz. The solutions (represented by dots) are organized in clusters. Whereas the distance $d_0$ between the clusters remains almost unchanged as $\alpha$ is increased by adding new clauses, the cluster size $d_1$ decreases quickly.

B. Critical behavior and exponents

The exact RS saddle-point equation [9] shows that the probability $P(h)$ that the effective field equals $h \gg 1$ on a given site is bounded from above by the probability that this site is connected to $h$ neighbours and then decreases at least exponentially with $h$. Combining this observation with the variational calculation presented in this paper, an investigation of the optimization equations over $B$ and $\Delta$ reveals that slightly above the threshold, the free-energy exhibits a singularity of the type

$$ f_{rs}(\delta\alpha) \simeq \delta\alpha \cdot (-\log \delta\alpha)^{-\eta} \qquad (\delta\alpha = \alpha - \alpha_c > 0) . \tag{51} $$

The actual scaling of $\Phi(h)$ at large fields $h$ gives rise only to logarithmic singularities, e.g. $\eta = 1/2$ for a Gaussian distribution, $\eta = 1$ for an exponential one. Note that equation (51) also holds at the RSB level within the Ansatz (43).

These predictions can be related to the recent finite-size scaling (FSS) numerical studies of the K-SAT model. Let us call $E_{GS}(\alpha, N)$ the average ground state energy for a finite number $N$ of Boolean variables. Close to the threshold, we expect the curves of $E_{GS}$ as functions of $\alpha$ obtained for different sizes $N$ to collapse onto each other when properly rescaled. In other words, FSS should hold and there should exist some exponents $\nu$ and $\gamma$ such that

$$ E_{GS}(\alpha, N) \simeq N^{\gamma}\, \mathcal{E}(N^{1/\nu}\, \delta\alpha) , \tag{52} $$

when $\delta\alpha = \alpha - \alpha_c \to 0$ and $N \gg 1$. $\nu$ characterizes the smaller and smaller width of the transition region from the SAT to the UNSAT phase as $N$ grows. It has been numerically calculated using a Davis-Putnam procedure: $\nu \simeq 3$ for 2+p-SAT as long as $p < p_0$, and $\nu$ decreases for larger $p$ down to $\simeq 1.5$ for 3-SAT. $\gamma$ can be simply interpreted: $N^{\gamma}$ is the minimal number of violated clauses at threshold and thus $\gamma < 1$. The rescaled ground state energy $\mathcal{E}(y)$ is a monotonically increasing function of its argument: $\mathcal{E}(y) \to 0$ when $y \to -\infty$ (right boundary of the SAT phase); $\mathcal{E}(0)$ is finite; $\mathcal{E}(y) \sim y$ when $y \to \infty$ (left boundary of the UNSAT phase). The latter scaling ensures that $E_{GS}$ grows above the threshold (that is at fixed $\delta\alpha$ while $N$ becomes larger and larger) as

$$ E_{GS}(\alpha, N) \simeq N^{\gamma + 1/\nu}\, \delta\alpha , \tag{53} $$

to coincide with (51) up to logarithmic singularities. Imposing that $E_{GS}(\alpha, N) = O(N)$ in the UNSAT regime, identity (53) gives the hyperscaling relation

$$ \gamma = 1 - \frac{1}{\nu} . \tag{54} $$

Whereas $\nu$ may be computed for large formulae, involving thousands of variables, no such powerful method exists so far to estimate $\gamma$. Therefore, identity (54) may be precious to derive indirectly $\gamma$ from the knowledge of $\nu$.
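As a worked illustration of how the hyperscaling relation would be used, the short sketch below simply evaluates $\gamma = 1 - 1/\nu$ for a given $\nu$. The function name is an illustrative choice, and the resulting numbers are plain arithmetic on the approximate values of $\nu$ quoted above, not independent measurements of $\gamma$.

```python
def gamma_from_nu(nu):
    """Exponent gamma implied by the hyperscaling relation gamma = 1 - 1/nu."""
    if nu <= 0:
        raise ValueError("nu must be positive")
    return 1.0 - 1.0 / nu

# nu ~ 3 (2+p-SAT below the tricritical point) and nu ~ 1.5 (3-SAT), as quoted in the text.
gamma_smooth = gamma_from_nu(3.0)
gamma_3sat = gamma_from_nu(1.5)
```

With these inputs the relation gives $\gamma \approx 2/3$ in the continuous regime and $\gamma \approx 1/3$ for 3-SAT, both below one as required by the interpretation of $N^{\gamma}$.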



C. Perspectives

As discussed in Section II.E, the threshold $\alpha_c$ separates $O(T)$ fields (SAT regime) from $O(1)$ ones (UNSAT phase). This change of scaling of effective fields is nicely apparent within the variational calculation presented in Sections III.A-B and IV.A-B. Looking at compatible Ansätze for the SAT and the UNSAT phase, the same values for $\alpha_c$ can be obtained either starting from the SAT phase from a diverging renormalized variance of the effective fields (which were assumed to vanish linearly with $T$), or coming from the UNSAT phase from a vanishing variance of the effective fields (which remained finite in the limit $T \to 0$). A deeper understanding of the scaling of the fields in the vicinity of the critical point $\alpha = \alpha_c$, $T = 0$ would be of interest for at least two reasons. First, it would allow the calculation of the entropy in the UNSAT phase, which has been out of reach so far. Secondly, the structure of the invariant measure $P(h)$ could be studied carefully to gain some information on its singularities, its fractal structure, etc. At finite but low temperatures, one might expect that the support of $P(h)$ includes different regions corresponding to different scalings with $T$, i.e. to distinct physical phenomena coexisting in the model. This potential richness of the order parameter cannot be present in infinite-connectivity spin glasses and could give rise to new properties at the mean-field level.

Further work is clearly required to confirm the geometrical picture of the space of solutions sketched in Section V.A. From a numerical point of view, an analysis of the distances between solutions would be of interest to check the existence of a non-trivial (not necessarily bimodal) distribution for $d$. It would also be worth trying to improve our analytical approach by using richer trial field distributions in the RSB calculations. However, the most promising route is probably to attempt to use the information presently available on the optimal (and quasi-optimal, see Section IV.B) assignments of K-SAT to understand the drastic change of behaviour of algorithms close to the threshold.

Acknowledgements: We are grateful to O. Dubois, S. Kirkpatrick and R. Zecchina for useful discussions, and J. 
Berg for carefully reading the manuscript. M.W. acknowledges financial support by the German Academic Exchange 
Service (DAAD). 

Note added in proofs: After submission of this paper we became aware of a new rigorous upper bound for
3-SAT, α_c < 4.506, proven by O. Dubois. This upper bound lies slightly above our RSB result.



APPENDIX A: RSB FREE-ENERGY IN THE SAT PHASE 



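In this appendix we show how the one-step replica symmetry broken ground state entropy is calculated. We start
from (|^), take the limit β → ∞,

$$
\lim_{n\to 0}\frac{1}{n}\left\{-\sum_{\vec\sigma} c(\vec\sigma)\ln c(\vec\sigma)
+\alpha\ln\left[\sum_{\vec\sigma_1,\ldots,\vec\sigma_K} c(\vec\sigma_1)\cdots c(\vec\sigma_K)
\prod_{a=1}^{n}\left(1-\prod_{l=1}^{K}\delta_{\sigma_l^a,1}\right)\right]\right\}
\tag{A1}
$$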



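and plug in Ansatz (13), that is

$$
c(\vec\sigma)=\int dz\,G_{\Delta_0}(z)\prod_{b=1}^{n/m}
\frac{\int d\tilde z\,G_{\Delta_1}(\tilde z-z)\exp\left[\tilde z\sum_{a=(b-1)m+1}^{bm}\sigma^a\right]}
{\int d\tilde z\,G_{\Delta_1}(\tilde z-z)\,(2\cosh\tilde z)^m}
\tag{A2}
$$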



We will calculate both terms on the r.h.s. of (Al) separately. We start with the effective entropic term and follow 
closely the analytical continuation scheme proposed in ||ll[], 

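$$
-\lim_{n\to 0}\frac{1}{n}\sum_{\vec\sigma} c(\vec\sigma)\ln c(\vec\sigma)
=-\int\mathcal{D}\nu\,\mathcal{D}\hat\nu\,\exp\left\{-i\int dy\,\nu(y)\hat\nu(y)\right\}
c[i\hat\nu]\ln c[i\hat\nu]\;
\frac{1}{m}\ln\left[\int\frac{dx\,dy}{2\pi}\,e^{-ixy}(2\cosh x)^m\exp\nu(iy)\right]
\tag{A3}
$$

with

$$
c[\hat\nu]=\int\mathcal{D}\rho\,P[\rho]\exp\left\{\int dy\,\hat\nu(y)
\ln\left[\int dh\,\rho(h)\,\frac{e^{\beta h y}}{(2\cosh\beta h)^m}\right]\right\}
\tag{A4}
$$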



Note that this form does not depend on K, i.e. on the length of the clauses, and the following calculations are
consequently valid for any K. With Ansatz (13), the last expression depends on ν̂(y) only through its first three
moments,

$$
c[\hat\nu]=c(\hat\nu_0,\hat\nu_1,\hat\nu_2)=\int dz\,G_{\Delta_0}(z)
\exp\left\{-\hat\nu_0\ln\int d\tilde z\,G_{\Delta_1}(\tilde z-z)(2\cosh\tilde z)^m
+z\,\hat\nu_1+\frac{\Delta_1}{2}\,\hat\nu_2\right\}
\tag{A5}
$$

with $\hat\nu_l=\int dy\,y^l\,\hat\nu(y)$ ($l=0,1,2$). The effective entropic part can now be calculated according to (A3) if we
introduce the series expansion $\nu(y)=\sum_{l=0}^{\infty}\nu_l\,y^l$. Using $\int dy\,\nu(y)\hat\nu(y)=\sum_{l=0}^{\infty}\nu_l\hat\nu_l$, the integrals over the $\nu_l$ and $\hat\nu_l$
with $l\ge 3$ can be executed trivially: $\hat\nu_l$ does not appear outside the first exponential, leading to a Dirac function in $\nu_l$,
which vanishes consequently. Thus we obtain



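$$
-\lim_{n\to 0}\frac{1}{n}\sum_{\vec\sigma}c(\vec\sigma)\ln c(\vec\sigma)
=-\int\prod_{l=0}^{2}\frac{d\nu_l\,d\hat\nu_l}{2\pi}\;e^{-i\sum_{l}\nu_l\hat\nu_l}\;
c(i\hat\nu_0,i\hat\nu_1,i\hat\nu_2)\ln c(i\hat\nu_0,i\hat\nu_1,i\hat\nu_2)\;
\frac{1}{m}\ln\left[\int\frac{dx\,dy}{2\pi}\,e^{-ixy}(2\cosh x)^m\exp\{\nu_0+i\nu_1y-\nu_2y^2\}\right]
=S_0+S_1
\tag{A6}
$$

where

$$
S_0=-\int\frac{d\nu_0\,d\hat\nu_0}{2\pi}\,e^{-i\nu_0\hat\nu_0}\,\frac{\nu_0}{m}\,
c(i\hat\nu_0,0,0)\ln c(i\hat\nu_0,0,0)
=\frac{1}{m}\int dz\,G_{\Delta_0}(z)\ln\int d\tilde z\,G_{\Delta_1}(\tilde z-z)\,(2\cosh\tilde z)^m
\tag{A7}
$$

and

$$
S_1=-\int\frac{d\nu_1\,d\hat\nu_1\,d\nu_2\,d\hat\nu_2}{(2\pi)^2}\,e^{-i(\nu_1\hat\nu_1+\nu_2\hat\nu_2)}\,
c(0,i\hat\nu_1,i\hat\nu_2)\ln c(0,i\hat\nu_1,i\hat\nu_2)\;
\frac{1}{m}\ln\left[\int\frac{dx\,dy}{2\pi}\,e^{-ixy}(2\cosh x)^m\exp\{i\nu_1y-\nu_2y^2\}\right]
=-\frac{1}{m}\left(\Delta_0\frac{\partial}{\partial\Delta_0}+\Delta_1\frac{\partial}{\partial\Delta_1}\right)
\int dz\,G_{\Delta_0}(z)\ln\int d\tilde z\,G_{\Delta_1}(\tilde z-z)\,(2\cosh\tilde z)^m
\tag{A8}
$$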



By calculating the derivatives and rescaling the integration variables we finally find the corresponding contribution 

in ©. ... 
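
The reduction to three moments in (A5) rests on a Gaussian identity that is easy to check numerically. The sketch below assumes that G_Δ denotes a centred Gaussian of variance Δ (an assumption made here; the normalization is fixed in the main text), in which case the average of exp(z̃ y) over z̃ distributed around z with variance Δ_1 equals exp(z y + Δ_1 y²/2), whose logarithm is a polynomial of degree two in y. The function names are illustrative only.

```python
import numpy as np

def gaussian_exp_average(z, y, delta1, n_nodes=80):
    """Average of exp(zt * y) over zt ~ Normal(mean=z, variance=delta1),
    using Gauss-Hermite quadrature for the weight exp(-x^2 / 2)."""
    x, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    zt = z + np.sqrt(delta1) * x
    return float(np.sum(w * np.exp(zt * y)) / np.sqrt(2.0 * np.pi))

def gaussian_exp_closed_form(z, y, delta1):
    """Closed form exp(z*y + delta1*y^2/2): its logarithm only involves
    1, y and y^2, hence only the moments of order 0, 1 and 2."""
    return float(np.exp(z * y + 0.5 * delta1 * y ** 2))

if __name__ == "__main__":
    print(gaussian_exp_average(0.3, 1.2, 0.7),
          gaussian_exp_closed_form(0.3, 1.2, 0.7))
```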

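Let us now calculate the effective energy, i.e. the explicitly α-dependent contribution in (|lj), starting from the last
term in (A1) by plugging in (A2). The sum over the replicated spin variables can easily be carried out. This directly
gives

$$
E=\lim_{n\to 0}\frac{\alpha}{n}\ln\left\{\int\prod_{l=1}^{K}\Big(dz_l\,G_{\Delta_0}(z_l)\Big)
\left[\frac{\int\prod_{l=1}^{K}\Big(d\tilde z_l\,G_{\Delta_1}(\tilde z_l-z_l)\Big)
\Big(\prod_{l=1}^{K}2\cosh\tilde z_l-\prod_{l=1}^{K}e^{\tilde z_l}\Big)^m}
{\prod_{l=1}^{K}\int d\tilde z_l\,G_{\Delta_1}(\tilde z_l-z_l)\,(2\cosh\tilde z_l)^m}\right]^{n/m}\right\}
\tag{A9}
$$

In the limit n → 0 and after a rescaling and translation of the integration variables to normally distributed Gaussian
variables, we find the corresponding expression given in the main text.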
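
The statement that the sum over the replicated spins "can easily be carried out" amounts, for a single replica, to a small identity that can be verified by brute force. The sketch below assumes that a clause is violated when all K of its spins equal +1 (a convention chosen here for illustration; the opposite sign convention only flips the sign of the fields). It compares the explicit sum over the 2^K spin configurations, with weight exp(Σ_l z_l σ_l) and the violating configuration removed, with the closed form Π_l 2cosh z_l − Π_l exp(z_l).

```python
import itertools
import math

def clause_weight_brute_force(z):
    """Sum of exp(sum_l z_l * sigma_l) over all spin configurations
    except the one violating the clause (all sigma_l = +1, by assumption)."""
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=len(z)):
        if all(s == 1 for s in sigma):
            continue  # violating configuration carries zero weight at T = 0
        total += math.exp(sum(zl * s for zl, s in zip(z, sigma)))
    return total

def clause_weight_closed_form(z):
    """prod_l 2 cosh(z_l) - prod_l exp(z_l)."""
    return math.prod(2.0 * math.cosh(zl) for zl in z) - math.exp(sum(z))

if __name__ == "__main__":
    fields = [0.3, -1.1, 0.7]
    print(clause_weight_brute_force(fields), clause_weight_closed_form(fields))
```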



APPENDIX B: RS GAUSSIAN ANSATZ FOR THE SAT-UNSAT TRANSITION 



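In this appendix, we compute the replica symmetric variational free-energy for a Gaussian distribution Φ(x) =
G_1(x). Using ( p6| ), we straightforwardly obtain for the K-SAT problem

$$
f_{\rm Gauss}(B,\Delta,\alpha,K)=2B\sqrt{\Delta}\;[\,\ldots\,]
\tag{B1}
$$

[The remainder of (B1) is garbled in the source. The legible pieces are a logarithm $\ln(1-B+B\,e^{-x^2/2})$ and an integral over $x$ with upper limit $1/(2\sqrt{\Delta})$ involving the Gaussian measure $Dy$.]

The corresponding free-energy for the 2+p-SAT model can be easily obtained by a linear combination of expression (B1)
for K = 2 (with weight 1 − p) and K = 3 (with weight p).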
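
As a minimal sketch of this linear combination, the helper below mixes the K = 2 and K = 3 free energies with weights 1 − p and p. The callable `f_ksat` is a hypothetical stand-in for expression (B1) evaluated at fixed B, Δ and α; it is not defined in this paper and must be supplied by the reader.

```python
from typing import Callable

def mixed_free_energy(f_ksat: Callable[[int], float], p: float) -> float:
    """Free energy of the 2+p-SAT model as the weighted sum of the
    K = 2 term (weight 1 - p) and the K = 3 term (weight p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return (1.0 - p) * f_ksat(2) + p * f_ksat(3)

if __name__ == "__main__":
    # Toy placeholder for (B1): any function of K works for illustration.
    print(mixed_free_energy(lambda K: float(K), 0.25))
```

Since the weights add up to one, any K-independent part of the free energy is left unchanged by the mixing.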

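However, for completeness we give a derivation of free-energy (B1) from (^) without any reference to [0. The order
parameter c(σ) reduces to

$$
c(\vec\sigma)=\frac{1-B}{2^n}+B\int_{-\infty}^{+\infty}Dx\prod_{a=1}^{n}
\frac{e^{\beta\sqrt{\Delta}\,x\,\sigma^a}}{2\cosh\big(\beta\sqrt{\Delta}\,x\big)}
\tag{B2}
$$

The first term on the r.h.s. of (^) may then be written in the limit n → 0 and β → ∞ as

[Equation (B3): the expression, a sum over an integer index l, is missing from the source.]

The difficulty in the computation of the entropic contribution is the analytic continuation in l. For the r.h.s. of (B3),
this can be easily achieved through the relation

[Equation (B4): an integral representation with measure $dx/2\pi$; the integrand is missing from the source.]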



27r 



Using (B4), the sum over l in (B3) can be carried out and expression (B1) recovered. The last term of (||) is found
when inserting the RS expression of c(σ) in (H) and dividing by n. Using the variational expression (^) of P(h), this
term reduces to

$$
\alpha B^K\int_0^{\infty}Dx_1\cdots Dx_K\,\min\!\big(1/\sqrt{\Delta},\,2x_1,\ldots,2x_K\big)
$$

when β → ∞. It is easy to check that the above expression coincides with the last term of (B1).
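
One way to carry out such a check is through the tail-integration identity ∫_0^∞ Dx_1···Dx_K min(c, 2x_1, ..., 2x_K) = 2 ∫_0^{c/2} dx [∫_x^∞ Dy]^K, with c = 1/√Δ, which produces an upper limit 1/(2√Δ) of the kind appearing in (B1). The sketch below compares both sides numerically; the lower limit 0 of the field integrals, the sample sizes and the function names are assumptions of this illustration, and the prefactor αB^K is left out.

```python
import math
import numpy as np

def lhs_monte_carlo(c, K, n_samples=400_000, seed=0):
    """Monte Carlo estimate of the K-fold Gaussian integral over positive
    x_l of min(c, 2*x_1, ..., 2*x_K)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, K))
    all_positive = np.all(x > 0.0, axis=1)
    smallest = np.minimum(c, 2.0 * x.min(axis=1))
    return float(np.mean(np.where(all_positive, smallest, 0.0)))

def rhs_tail_integral(c, K, n_steps=20_000):
    """2 * integral_0^{c/2} dx [Gaussian tail beyond x]^K, midpoint rule."""
    h = (c / 2.0) / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        tail = 0.5 * math.erfc(x / math.sqrt(2.0))  # integral_x^inf Dy
        total += tail ** K
    return 2.0 * h * total

if __name__ == "__main__":
    delta = 1.0
    c = 1.0 / math.sqrt(delta)
    print(lhs_monte_carlo(c, K=3), rhs_tail_integral(c, K=3))
```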
APPENDIX C: VARIATIONAL UPPER BOUND TO p_0



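For a given distribution Φ, the tricritical point p_0 is obtained through the condition f_RS(1/(1 − p_0), p_0) = 0. Using
(^), we obtain

$$
p_0[\Phi]=\frac{A[\Phi]}{A[\Phi]+B[\Phi]}
\tag{C1}
$$

where

[Equations (C2) and (C3), defining the functionals A[Φ] and B[Φ], are garbled in the source; the only legible piece is an integral of the form $\int_{-\infty}^{+\infty}dh\,[\Phi_{cc}(h)]^{\,\cdot}$ with an unreadable exponent.]

The extremization condition of p_0 with respect to the even distribution Φ may thus be written as

$$
\frac{\delta p_0}{\delta\Phi(x)}[\Phi]=\Lambda,\qquad\forall x>0,
\tag{C4}
$$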



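where Λ is a Lagrange multiplier ensuring the normalization of Φ. Equation (C4) involves the functional derivatives
of A and B,

[Equations (C5) and (C6) are garbled in the source. The legible part of (C5) shows that $\delta A/\delta\Phi(x)$ contains $\int_{-\infty}^{\infty}dy\,\Phi(y)\,[\Phi_c(x+y)-\Phi_{cc}(x)]$ and a second integral $2\int_{-\infty}^{\infty}dy\,y\,\Phi(y)\,[\,\cdot\,(x-y)-\theta(y-x)]$ whose first bracketed function is unreadable; (C6), for $\delta B/\delta\Phi(x)$, is not legible.]

where θ(·) denotes the Heaviside function. By subtracting the values of the functional derivatives of p_0 (C4) in x = 0
and x → ∞, the Lagrange multiplier Λ disappears. We obtain B[Φ]/A[Φ] = 3/2 and therefore from (C1),

$$
\min_{\Phi}\,p_0[\Phi]=\frac{2}{5}
\tag{C7}
$$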



The determination of the optimal distribution, although of interest (see Section V.B), would be more difficult. Note
that we have implicitly assumed in the functional differentiations (C4), (C5), (C6) that Φ included no Dirac distributions.
The value p_0 = 2/5 directly comes from this hypothesis as shown in |10(|.
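
The last arithmetic step can be made explicit with exact fractions. The sketch below only uses two relations stated in the text: p_0 = A/(A + B), i.e. p_0 = 1/(1 + B/A), together with B/A = 3/2, and the threshold α_c = 1/(1 − p) of the mixed model quoted in the abstract. The function name is illustrative.

```python
from fractions import Fraction

def tricritical_point(b_over_a):
    """p_0 = A / (A + B) rewritten as 1 / (1 + B/A)."""
    return 1 / (1 + Fraction(b_over_a))

p0 = tricritical_point(Fraction(3, 2))  # Fraction(2, 5)
alpha_c = 1 / (1 - p0)                  # Fraction(5, 3), threshold at p = p_0

if __name__ == "__main__":
    print(p0, alpha_c)
```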



APPENDIX D: RSB FREE-ENERGY FOR THE SAT-UNSAT TRANSITION 



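Within the RSB Ansatz (43), the order parameter c(σ) reduces to:

[Equation (D1) is garbled in the source. The legible pieces are a first term with denominator $2^n$, a term weighted by $B_0$, and Gaussian integrals $\int_{-\infty}^{+\infty}Dh$ over products of factors normalized by hyperbolic cosines of the rescaled fields.]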



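In the following, we compute the effective entropy contribution by taking the derivative of $\sum_{\vec\sigma}[c(\vec\sigma)]^l$ with respect to
l, see (B3). For β → ∞, we find

[Equation (D2) is garbled in the source. Its legible structure is $S=\frac{\sqrt{\Delta_1}}{\mu}\frac{d}{dl}\{\ldots\}_{l=1}$, where the braces contain a double sum over $p$ and $q$ (both starting at 0) with weights built from powers of $(1-B_1-B_0)$, $B_1$ and $B_0$, multiplying Gaussian integrals over $h_1,\ldots,h_p$ and three logarithmic terms.]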
where μ = mβ√Δ_1 and r = √(Δ_0/Δ_1). As in the replica-symmetric computation of appendix B, the main difficulty
is the analytic continuation in l. For the last two terms in (D1) the sum over l can be performed explicitly, therefore
the analytic continuation can be trivially performed. It is easy to check that the contribution to S of these two terms
leads to the second and the third terms of (^) multiplied by √Δ_1. For the first term of (D2) the analytic continuation
in l is more tricky. First of all, using the convolution properties of Gaussian functions, this term is reduced to

[Equation (D3) is garbled in the source; only a derivative $\frac{d}{dl}$ evaluated at $l=1$ and a sum over $q$ starting at 0 are legible.]



where the function $\mathcal{K}(a,b)$ has been defined in (^). Then the analytic continuation in l can be achieved using the
function L(x,y) defined in (|4^ , |47| ) and the first term of ( ^5|) (multiplied by √Δ_1) is recovered. Although written in a
compact way, the resulting S_1 is not very useful for numerical purposes. We have rather used the equivalent expression

[Equation (D4) is garbled in the source. It expresses $S_1$ through sums over $l$ (from 1), $p$ (from 0) and $q$ (from 0); the last legible terms are $+\,B_0\,\mathcal{K}\big(\mu\sqrt{q},\,\mu r\sqrt{p-q+1}\big)+(1-B_1-B_0)\,\mathcal{K}\big(\mu\sqrt{q},\,\mu r\sqrt{p-q}\big)$.]
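We now turn to the effective energy contribution, which reads

$$
E=-\frac{\alpha}{\beta}\int\mathcal{D}\rho_1\cdots\mathcal{D}\rho_K\,P[\rho_1]\cdots P[\rho_K]
\times\frac{1}{m}\ln\left\{\int_{-\infty}^{+\infty}dh_1\cdots dh_K\,\rho_1(h_1)\cdots\rho_K(h_K)
\left[1+(e^{-\beta}-1)\prod_{l=1}^{K}\frac{e^{\beta h_l}}{2\cosh\beta h_l}\right]^m\right\}
\tag{D5}
$$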



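Plugging the trial variational functional (43) into (D5), we find for β → ∞,

[Equation (D6) is garbled in the source. The legible pieces are a logarithm of a $K$-fold Gaussian integral over positive fields and the function $\min\!\big(1/\sqrt{\Delta_1},\,2h_1,\ldots,2h_q,\,2rh_{q+1},\ldots,2rh_K\big)$, summed over the index $q$.]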



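where θ(h) is the Heaviside function. First of all we focus on the q = K term, which is the only one that does not
vanish for B_1 = 1. In this case a simple integration by parts leads to

[Equation (D7) is garbled in the source. The legible pieces are a prefactor containing $\alpha$ and $\sqrt{\Delta_1}$, a logarithm of the form $\ln[1-\ldots]$, and an integration limit $1/(2\sqrt{\Delta_1})$.]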



where the function H has been defined in ([isl). The other terms in the sum over q lead to two different contributions.
For h_{q+1}, ..., h_K > 1/(2√Δ_0), by an integration by parts the integrals in the qth term reduce to:

[Equation (D8) is garbled in the source; only a Gaussian integral $\int Dh$ and a logarithm of the form $\ln[1-\ldots]$ are legible.]



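On the other hand, if at least one field among h_{q+1}, ..., h_K is smaller than 1/(2√Δ_0), a little
algebra shows that the integrals in the qth term reduce to

[Equation (D9) is garbled in the source. The legible pieces are a combinatorial prefactor $(K-q)$, two Gaussian integrals $\int Dh$, one of them with upper limit $+\infty$, a logarithm of the form $\ln[1-\ldots]$, and an inner integral from $0$ to $rh$.]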



Collecting (D8) and (D9) and integrating by parts, we find that the sum over q (for q ≠ K) reduces to

[Equation (D10) is garbled in the source. The legible pieces are exponents $K-q$, a Gaussian integral $\int Dh'$, and an inner integral $\int_0 dh'\,e^{-\mu h'}\ldots$ with prefactor $2\mu$.]



Finally, gathering (D10) and (D7), one obtains the final form of the effective energy part.

It is easy to verify that both Δ_1 and Δ_0 vanish at the transition while the ratio r = √(Δ_0/Δ_1) has a nontrivial
value. Dividing the variational free energy by √Δ_1, the entropic contribution depends on Δ_1 and Δ_0 through r
alone. Therefore, as in the replica symmetric analysis, the optimization on Δ_1 has to be performed for the energetic
contribution only. This procedure leads to Δ_1 = Δ_0 = 0 at the transition. As a consequence, within the variational
approach the analysis of the SAT-UNSAT transition reduces to the study of the variational free energy (Ma).